Many quantum algorithms have daunting resource requirements when compared to what is available
today. To address this discrepancy, a quantum-classical hybrid optimization scheme known
as “the quantum variational eigensolver” was developed [1] with the philosophy that even minimal
quantum resources could be made useful when used in conjunction with classical routines. In this
work we extend the general theory of this algorithm and suggest algorithmic improvements for
practical implementations. Specifically, we develop a variational adiabatic ansatz and explore unitary
coupled cluster, where we establish a connection from second-order unitary coupled cluster to
universal gate sets through relaxation of exponential splitting. We introduce the concept of quantum
variational error suppression, which allows some errors to be suppressed naturally in this algorithm
on a pre-threshold quantum device. Additionally, we analyze truncation and correlated sampling in
Hamiltonian averaging as ways to reduce the cost of this procedure. Finally, we show how the use
of modern derivative-free optimization techniques can offer dramatic computational savings of up
to three orders of magnitude over previously used optimization techniques.


I. INTRODUCTION 

Eigenvalue and more general optimization problems lie at the heart of
applications and technologies ranging from Google’s PageRank and aircraft
design to quantum simulation and quantum chemistry. Quantum computers
promise to provide groundbreaking advances in our ability to solve these
problems by offering solutions that may be exponentially faster than the
classical equivalent in some cases. However, delivering on these promises
may require overcoming considerable technological challenges.

Since the initial proposal by Richard Feynman, a number of advances have
been made in understanding how to use a quantum computer to help solve
eigenvalue and optimization problems. The quantum simulation algorithms of
Abrams and Lloyd showed how eigenvalues corresponding to some Hermitian
operator could be extracted from eigenvectors exponentially faster with
respect to dimension than the classical equivalent. Leveraging this idea,
Aspuru-Guzik et al. showed how one could perform exact quantum chemistry
computations in polynomial time for some instances, pushing the boundaries
of predictive quantum chemistry [5]. These ideas have since been tested
successfully in proof-of-principle quantum experiments using architectures
such as quantum photonics, nitrogen vacancies in diamond, and ion traps.

In recent years, there has been a growing interest in the particular
application of quantum chemistry on quantum computers. As a result, a
number of efforts have been made to study the scaling and performance of
various algorithms while simultaneously offering dramatic algorithmic
improvements. The original proposal of quantum chemistry on a quantum
computer also introduced the idea of adiabatic state preparation, closely
related to general adiabatic quantum computation. A number of advances in
this field, as well as extensions of adiabatic computation concepts to more
general optimization problems, have arisen as well.

Unfortunately, despite developments in quantum algorithms and optimization
of resource requirements, many of the algorithms have hardware requirements
far beyond the capability of near-term quantum computers. Moreover, the
overhead of some asymptotically optimal algorithms is such that even the
first quantum computers competitive with classical supercomputers may not
be able to run them. To this end, in 2014 Peruzzo and McClean et al.
developed the variational quantum eigensolver (VQE), a hybrid
quantum-classical algorithm designed to utilize both quantum and classical
resources to find variational solutions to eigenvalue and optimization
problems not accessible to traditional classical computers [1]. This
algorithm was originally implemented and tested on a photonic quantum chip
and has since been extended both theoretically and experimentally to ion
trap quantum computers [33].

The VQE has the notable property that it can run on any quantum device,
making it a candidate for exploring the performance of early quantum
computers. Moreover, the algorithm is designed to take advantage of the
strengths of a given architecture. That is, if some gates or quantum
operations may be performed with higher fidelity, then the algorithm can
leverage these strengths in the design of the quantum hardware ansatz.
Perhaps one of the most interesting features of the algorithm is its
ability to variationally suppress some forms of quantum errors, which is
discussed later in this work. This intrinsic robustness to quantum errors,
in combination with low coherence time requirements, has placed this
algorithm as a potential candidate for the first to surpass a classical
computer using a pre-threshold quantum device. Even in the event that some
error correction is required to exceed current computational capabilities,
this same robustness may translate to requiring minimal error correction
resources when compared with other algorithms.

In this work we aim to present the hybrid quantum-classical variational
approach in more detail, offering both theoretical and practical exposition
on developments since the original hybrid quantum-classical proposal.
Additionally, although a strength of the variational quantum eigensolver is
its ability to adapt to the given hardware, this work will be the first to
analyze VQE in the abstract, in a way that is completely general to any
quantum device. We begin by reviewing background and notation as well as
the outline of the variational quantum eigensolver algorithm. This is
followed by a discussion of ansatz states that allow one to explore
classically inaccessible regions of Hilbert space, including a variational
formulation of adiabatic state preparation and unitary coupled cluster. We
then explore how this approach may be used to variationally suppress
certain types of quantum errors. Following this, we introduce several
computational enhancements to the Hamiltonian averaging method for
obtaining expectation values, including the truncation of unimportant terms
and grouping terms by commutation and covariance. These enhancements are
able to considerably reduce the cost of the procedure. Finally, we cover
aspects of the classical optimization procedure associated with the VQE and
show how modern derivative-free optimization techniques have the potential
to greatly enhance the efficacy of the method.


II. BACKGROUND AND NOTATION 

A. General Quantum Systems and the Variational 
Principle 

Let us consider a quantum system S composed of N qubits which will act as
our quantum computer, and a Hamiltonian $H$ of a different system Q that
need have no relation to S other than acting on a space of at most N
qubits. This Hamiltonian could be derived from a physical system such as a
collection of interacting spins or the discretization of an interacting
electronic system. Similarly, it could come from the encoding of an
optimization problem or the problem Hamiltonian in adiabatic quantum
computation. In all of these instances, one is interested in the
eigenvectors and eigenvalues, $|\chi_i\rangle$ and $\lambda_i$, of the
Hamiltonian $H$, and the goal will be to find and study these eigenvectors
and eigenvalues using S.

In the VQE approach, the eigenvectors are encoded by a set of parameters
that can be used to prepare them on demand when other observables are
desired. We order the eigenvectors by the eigenvalues such that
$\lambda_1 \leq \lambda_2 \leq \dots$. Indeed, in many cases, the
eigenvectors corresponding to the lowest few eigenvalues and their
properties are of primary interest. In physical systems this is because
low-energy states play a dominant role in the properties of the system at
modest temperatures, and in optimization problems they often encode the
optimal solution.

Recall the expectation value of an operator $O$ with respect to a state
$|\Psi\rangle$,

\[ \langle O \rangle_{|\Psi\rangle} = \frac{\langle \Psi | O | \Psi \rangle}{\langle \Psi | \Psi \rangle}. \tag{1} \]

We will assume normalization of the wavefunction,
$\langle \Psi | \Psi \rangle = 1$, for the remainder of the work; however,
attention should be paid in the case of leakage errors from the
computational basis. Our attention is restricted to the class of operators
whose expectation value can be measured efficiently on S and mapped to Q. A
sufficient condition for this property is that operators have a
decomposition into a polynomial sum of simple operators as

\[ O = \sum_{\alpha} h_\alpha O_\alpha, \tag{2} \]


where $O$ is an operator that acts on Q, $\alpha$ runs over a number of
terms polynomial in the size of the system, $h_\alpha$ is a constant
coefficient, and each $O_\alpha$ has a simple measurement prescription on
the system S. This will allow for straightforward determination of
expectation values of $O$ on Q by weighted summation of projective
measurements on the quantum device S. A simple example of this is the
decomposition of a Hermitian operator into a sum of tensor products of
Pauli operators weighted by constant coefficients.
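
As a rough illustration of the weighted summation described above, the following Python sketch estimates $\langle O \rangle = \sum_\alpha h_\alpha \langle O_\alpha \rangle$ for a single qubit by simulating repeated projective Pauli measurements. The one-parameter state family, the coefficient values, the shot count, and all function names are assumptions made for this example, and the quantum device is replaced here by a classical random number generator.

```python
import math
import random

def exact_pauli_means(theta):
    # For cos(theta/2)|0> + sin(theta/2)|1>: <X> = sin(theta), <Z> = cos(theta).
    return {"X": math.sin(theta), "Z": math.cos(theta)}

def sample_pauli_mean(mean, shots, rng):
    # A projective Pauli measurement returns +1 or -1 with P(+1) = (1 + mean) / 2.
    p_plus = 0.5 * (1.0 + mean)
    total = sum(1 if rng.random() < p_plus else -1 for _ in range(shots))
    return total / shots

def estimate_expectation(coefficients, theta, shots=20000, seed=0):
    # Weighted sum of independently estimated Pauli expectation values.
    rng = random.Random(seed)
    means = exact_pauli_means(theta)
    return sum(h * sample_pauli_mean(means[label], shots, rng)
               for label, h in coefficients.items())

coefficients = {"X": 0.5, "Z": 1.0}   # illustrative h_alpha values (assumed)
theta = 0.7
exact = sum(h * exact_pauli_means(theta)[label] for label, h in coefficients.items())
estimate = estimate_expectation(coefficients, theta)
print(f"exact = {exact:.4f}, sampled estimate = {estimate:.4f}")
```

The statistical error of the estimate shrinks as the inverse square root of the number of shots, so the number of repetitions sets the attainable precision.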

Consider a set of real-valued parameters $\{\theta_i\}$, which we arrange
into a vector $\vec{\theta}$, and the Hamiltonian $H$ of Q. If one prepares
S into a quantum state depending on these parameters,
$|\Psi(\vec{\theta})\rangle$, then the variational theorem of quantum
mechanics states that

\[ \langle H \rangle_{|\Psi(\vec{\theta})\rangle} = \langle H \rangle(\vec{\theta}) = \langle \Psi(\vec{\theta}) | H | \Psi(\vec{\theta}) \rangle \geq \lambda_1. \tag{3} \]

As a result, the optimal choice of $\vec{\theta}$ to approximate the ground
state (or eigenvector corresponding to the lowest eigenvalue) is the choice
which minimizes $\langle H \rangle(\vec{\theta})$. Note that the state is
normalized for all choices of $\vec{\theta}$ by the unitarity of quantum
evolution or trace preservation under quantum operations in state
preparation. The variational principle also extends to other eigenstates.
If one has constructed an ordered orthonormal set of k approximate
eigenstates $\{|\Psi_i\rangle\}_{i=1}^{k}$ such that
$\langle H \rangle_{|\Psi_1\rangle} \leq \dots \leq \langle H \rangle_{|\Psi_k\rangle}$,
then

\[ \langle H \rangle_{|\Psi_i\rangle} \geq \lambda_i \quad \forall\, i \in [1, k], \tag{4} \]

where $\lambda_i$ are the ordered true eigenvalues of the operator $H$.
Thus, repeated application of the variational principle under orthogonality
constraints can yield an approximation to as much of the spectrum as
desired, incurring additional cost for each eigenvalue. Alternatively, one
can perform a spectral transform to the Hamiltonian and use the
ground-state variational principle to find excited states, as in the folded
spectrum method [35]. That is, minimize $\langle H' \rangle(\vec{\theta})$
where $H' = (H - \gamma I)^2$ and $\gamma$ is some real parameter. The
ground state of the transformed Hamiltonian is the eigenstate of the
original Hamiltonian whose eigenvalue is closest to $\gamma$.
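
To make the two statements above concrete, here is a small Python sketch that checks the variational bound and the folded spectrum idea on a single-qubit Hamiltonian. The Hamiltonian $H = \sigma_z + 0.5\,\sigma_x$, the real one-parameter trial state, the value of $\gamma$, and the brute-force grid minimization are all assumptions chosen for illustration; they are not taken from the text.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def expectation(h, psi):
    # <psi|H|psi> for a real, normalized vector psi.
    n = len(psi)
    return sum(psi[i] * h[i][j] * psi[j] for i in range(n) for j in range(n))

def trial_state(theta):
    # Real states cos(theta/2)|0> + sin(theta/2)|1> suffice for a real symmetric H.
    return [math.cos(theta / 2), math.sin(theta / 2)]

def minimize_on_grid(h, points=2000):
    thetas = [2 * math.pi * k / points for k in range(points)]
    return min((expectation(h, trial_state(t)), t) for t in thetas)

H = [[1.0, 0.5], [0.5, -1.0]]                      # sigma_z + 0.5 sigma_x
lam_low, lam_high = -math.sqrt(1.25), math.sqrt(1.25)

# Ground-state variational principle: the minimum never drops below lam_low.
e_min, theta_min = minimize_on_grid(H)

# Folded spectrum: minimizing <(H - gamma I)^2> targets the eigenvalue nearest gamma.
gamma = 1.0                                        # closer to lam_high than to lam_low
shifted = [[H[i][j] - (gamma if i == j else 0.0) for j in range(2)] for i in range(2)]
H_folded = matmul(shifted, shifted)
_, theta_folded = minimize_on_grid(H_folded)
e_folded = expectation(H, trial_state(theta_folded))

print(f"min <H> = {e_min:.6f} (lambda_1 = {lam_low:.6f})")
print(f"<H> at folded minimum = {e_folded:.6f} (nearest eigenvalue = {lam_high:.6f})")
```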

More generally, the state preparation scheme may be influenced by an
environment and would be better represented by an ensemble given by a
density matrix $\rho(\vec{\theta})$. In an ideal scenario where the
preparation is error free and a pure state is maintained,
$\rho(\vec{\theta}) = |\Psi(\vec{\theta})\rangle \langle \Psi(\vec{\theta})|$.
In the density matrix formalism, the expectation value of an operator $O$
is given by

\[ \langle O \rangle_{\rho} = \mathrm{Tr}[\rho O] \tag{5} \]

and the ground state variational principle on the Hamiltonian $H$ still
holds such that for any approximate density matrix $\rho(\vec{\theta})$,
and for all choices of $\vec{\theta}$,

\[ \langle H \rangle_{\rho(\vec{\theta})} = \langle H \rangle(\vec{\theta}) = \mathrm{Tr}[\rho(\vec{\theta}) H] \geq \lambda_1. \tag{6} \]

As a result, the optimal choice of $\vec{\theta}$ to approximate the ground
state is that which minimizes $\langle H \rangle(\vec{\theta})$. The fact
that this principle still holds for mixed states has important consequences
for the robustness of the method to errors and environmental influence. By
finding the set of parameters that minimizes the energy, one is, in effect,
finding a set of experimental parameters most likely to produce the ground
state on average, potentially effecting a blind purification of the state
being produced. This ability to suppress errors without knowledge of the
mechanism will be elaborated upon later in this work.
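
The following Python sketch illustrates the mixed-state variational principle in one simple setting. It assumes a hypothetical depolarizing noise model, $\rho \to (1-p)\rho + p\,I/2$, acting on a single-qubit trial state, with an illustrative traceless Hamiltonian and a grid search over the parameter; none of these choices come from the text. Under this particular model the noisy energy stays above $\lambda_1$ and the minimizing parameter is unchanged, although for other noise models the optimum can shift.

```python
import math

def pure_density(theta):
    # |psi><psi| for the real state cos(theta/2)|0> + sin(theta/2)|1>.
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * c, c * s], [c * s, s * s]]

def depolarize(rho, p):
    # Assumed noise model: rho -> (1 - p) rho + p I/2.
    return [[(1 - p) * rho[i][j] + (p / 2 if i == j else 0.0) for j in range(2)]
            for i in range(2)]

def energy(rho, h):
    # Tr[rho H]
    return sum(rho[i][j] * h[j][i] for i in range(2) for j in range(2))

H = [[1.0, 0.5], [0.5, -1.0]]          # illustrative sigma_z + 0.5 sigma_x
lam_1 = -math.sqrt(1.25)               # its lowest eigenvalue
thetas = [2 * math.pi * k / 360 for k in range(360)]

def best_theta(p):
    return min(thetas, key=lambda t: energy(depolarize(pure_density(t), p), H))

theta_clean = best_theta(0.0)
theta_noisy = best_theta(0.3)
e_clean = energy(pure_density(theta_clean), H)
e_noisy = energy(depolarize(pure_density(theta_noisy), 0.3), H)
print(f"optimal theta: clean {theta_clean:.4f}, noisy {theta_noisy:.4f}")
print(f"energies: clean {e_clean:.4f}, noisy {e_noisy:.4f}, lambda_1 {lam_1:.4f}")
```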

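Another important quantity is the variance of an operator with respect to a
state. For an operator $O$ and a general mixed state $\rho$, this is given
by

\begin{align}
\mathrm{Var}[O]_{\rho} &= \left\langle \left( O - \langle O \rangle_{\rho} \right)^2 \right\rangle_{\rho} \tag{7} \\
&= \langle O^2 \rangle_{\rho} - \langle O \rangle_{\rho}^2. \tag{8}
\end{align}

A variational principle on the variance exists as well, and has been used
extensively for optimization in the context of quantum Monte Carlo. Note
that for any eigenstate $|\Psi_k\rangle$ of an operator $O$ with eigenvalue
$\lambda_k$, the variance is given by

\[ \langle \Psi_k | O^2 | \Psi_k \rangle - \langle \Psi_k | O | \Psi_k \rangle^2 = (\lambda_k^2) - (\lambda_k)^2 = 0 \tag{9} \]

and for any approximate eigenstate $|\Psi\rangle$, we have that

\[ \mathrm{Var}[O]_{|\Psi\rangle} \geq 0. \tag{10} \]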


B. Fermionic Hamiltonians and Quantum Chemistry

While the VQE and its principles can be applied to general quantum
problems, an application of particular recent interest is that of quantum
chemistry and fermionic Hamiltonians. Given a set of nuclear charges $Z_i$
and a number of electrons, the standard form of the electronic structure
problem is to solve for the eigenvectors and eigenvalues of the electronic
Hamiltonian $H$, written as

\[ H = -\sum_i \frac{\nabla_{R_i}^2}{2 M_i} - \sum_i \frac{\nabla_{r_i}^2}{2} - \sum_{i,j} \frac{Z_i}{|R_i - r_j|} + \sum_{i,j>i} \frac{Z_i Z_j}{|R_i - R_j|} + \sum_{i,j>i} \frac{1}{|r_i - r_j|} \tag{11} \]

where atomic units have been used, $R_i$ are nuclear positions, $r_i$
electronic positions, and $M_i$ are nuclear masses. Due to the large
separation between the nuclear and electronic masses, an excellent
approximation to this problem at the time and energy scales of chemical
interest is to treat the nuclei as classical point charges under the
Born-Oppenheimer approximation with fixed positions $R_i$. The problem as
written is referred to as the first quantized representation of the quantum
chemistry problem. A number of algorithms have been developed for quantum
computers to treat the problem directly within this framework [28, 37, 38];
however, the focus in this work will be on the second quantized treatment.

To reach the practical form of the second quantized Hamiltonian, one must
project the problem into a finite, orthogonal, spin-orbital basis, whose
members we denote $\varphi_p(\sigma)$, and impose the requirements of
fermion anti-symmetry through the fermion creation and annihilation
operators $a_i^\dagger$ and $a_i$. With these steps, the second quantized
Hamiltonian takes the form

\[ H = \sum_{pq} h_{pq}\, a_p^\dagger a_q + \frac{1}{2} \sum_{pqrs} h_{pqrs}\, a_p^\dagger a_q^\dagger a_r a_s \tag{12} \]

with coefficients determined by the spin orbital basis as

\begin{align}
h_{pq} &= \int d\sigma\, \varphi_p^*(\sigma) \left( -\frac{\nabla_r^2}{2} - \sum_i \frac{Z_i}{|R_i - r|} \right) \varphi_q(\sigma) \tag{13} \\
h_{pqrs} &= \int d\sigma_1\, d\sigma_2\, \frac{\varphi_p^*(\sigma_1)\, \varphi_q^*(\sigma_2)\, \varphi_s(\sigma_1)\, \varphi_r(\sigma_2)}{|r_1 - r_2|} \tag{14}
\end{align}

where $\sigma_i$ describes both the spatial position and spin of an
electron as $\sigma_i = (r_i, s_i)$. The operators $a_i^\dagger$ and $a_i$
obey the standard fermion anti-commutation relations

\begin{align}
\{a_p^\dagger, a_r\} &= a_p^\dagger a_r + a_r a_p^\dagger = \delta_{p,r} \tag{15} \\
\{a_p^\dagger, a_r^\dagger\} &= \{a_p, a_r\} = 0. \tag{16}
\end{align}

A crucial part of solving these problems on quantum computers is the
mapping from fermions to qubits. The two most common mappings under current
study are the Jordan-Wigner transformation [39, 40] and the Bravyi-Kitaev
transformation. In the case of the Jordan-Wigner transformation, the
mapping from fermion operators to qubits is

\begin{align}
a_p^\dagger &= \left( \prod_{m<p} \sigma_m^z \right) \sigma_p^+ \tag{17} \\
a_p &= \left( \prod_{m<p} \sigma_m^z \right) \sigma_p^- \tag{18} \\
\sigma^\pm &= (\sigma^x \mp i \sigma^y)/2. \tag{19}
\end{align}
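
As a quick sanity check of the Jordan-Wigner mapping, the sketch below builds the qubit matrices for a few modes in plain Python and verifies the fermionic anti-commutation relations numerically. The helper names, the choice of three modes, and the nested-list matrix representation are assumptions made for this illustration; the convention $\sigma^+ = (\sigma^x - i\sigma^y)/2 = |1\rangle\langle 0|$ is used, so $|1\rangle$ denotes an occupied mode.

```python
I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]
SIGMA_PLUS = [[0, 0], [1, 0]]    # (sigma_x - i sigma_y) / 2 = |1><0|
SIGMA_MINUS = [[0, 1], [0, 0]]   # (sigma_x + i sigma_y) / 2 = |0><1|

def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    n = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(len(b[0]))]
            for i in range(len(a))]

def jordan_wigner(p, n_modes, local):
    """sigma_z on modes m < p, `local` on mode p, identity on modes m > p."""
    out = [[1]]
    for factor in [Z] * p + [local] + [I2] * (n_modes - p - 1):
        out = kron(out, factor)
    return out

def creation(p, n_modes):
    return jordan_wigner(p, n_modes, SIGMA_PLUS)

def annihilation(p, n_modes):
    return jordan_wigner(p, n_modes, SIGMA_MINUS)

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(len(ab))] for i in range(len(ab))]

n_modes = 3
dim = 2 ** n_modes
identity = [[1 if i == j else 0 for j in range(dim)] for i in range(dim)]
zero = [[0] * dim for _ in range(dim)]

# Check {a_p^dagger, a_q} = delta_pq and {a_p, a_q} = {a_p^dagger, a_q^dagger} = 0.
relations_hold = all(
    anticommutator(creation(p, n_modes), annihilation(q, n_modes)) == (identity if p == q else zero)
    and anticommutator(annihilation(p, n_modes), annihilation(q, n_modes)) == zero
    and anticommutator(creation(p, n_modes), creation(q, n_modes)) == zero
    for p in range(n_modes) for q in range(n_modes)
)
print("anti-commutation relations hold:", relations_hold)
```

The string of $\sigma^z$ operators is what restores the sign structure between different modes; without it the operators on distinct qubits would commute rather than anti-commute.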


C. Reference States 

Many traditional methods for electronic structure involve the concept of a
reference state. A reference state is a product state that is used as a
starting point to define a more general quantum state, and can allow for
great formal simplification. Here we will briefly introduce why they are
convenient and useful, and then how they are obtained.

An example spin reference state $|\Phi_{\text{s-ref}}\rangle$ and fermion
reference state $|\Phi_{\text{f-ref}}\rangle$ might be the general product
states

\begin{align}
|\Phi_{\text{s-ref}}\rangle &= \prod_i^N \left( c_i^0 |0\rangle + c_i^1 |1\rangle \right) \tag{20} \\
|\Phi_{\text{f-ref}}\rangle &= \prod_i^N \left( \sum_j^M c_i^j a_j^\dagger \right) |\rangle \tag{21}
\end{align}

where $|\rangle$ is the fermion vacuum state, M is the number of sites a
fermion can occupy, and N is the number of qubits or fermions. Even though
these are separable product states, their manipulation theoretically or
preparation on a quantum computer can be cumbersome as written. However,
because they are product states, there exist efficient, local unitary basis
transformations $U_s \in SU(2)^{\otimes N}$ and $U_f \in SU(M)$ such that
these states can be rotated into a simple form with weight on a single
computational basis state. That is,

\begin{align}
U_s |\Phi_{\text{s-ref}}\rangle &= |000 \dots 0\rangle \tag{22} \\
U_f |\Phi_{\text{f-ref}}\rangle &= \prod_i^N a_i^\dagger |\rangle \tag{23}
\end{align}

and because the transformations are local, the transformation of the
Hamiltonian to the new basis such that the physical problem remains
unchanged is also efficient. In the case of quantum chemistry, this
corresponds to a transformation of the integral terms $h_{pq}$ and
$h_{pqrs}$, which may be computed in a time $O(M^5)$ exactly.

These new simpler forms of the state have advantages both in theoretical
manipulation and in ease of preparation with quantum resources. For
example, the preparation of the untransformed spin reference state could
require at least $O(N)$ local rotations, not including error correction on
a quantum device to prepare from a computational basis state, whereas the
new reference is simply the computational basis state from which most
computations begin. Here we have traded modest classical effort in
transforming the basis of the Hamiltonian for savings in quantum resources.


These reference states are typically obtained from mean field calculations,
which are guaranteed to have product states, such as those given above, as
solutions. In chemistry, this procedure is called Hartree-Fock, and the
transformation of the state to the simplified form is known as the
canonical condition in the solutions of the Hartree-Fock equations,
resulting in the canonical molecular orbitals.

When the problem is well treated by mean-field theory, it can be shown
through perturbation theory that the dominant corrections to the mean-field
solution are given by quantum states “close” to the mean-field solution in
the sense of fermion excitations [43] or Hamming distance. This is the
origin of the perturbative MP2 method, configuration interaction, and
coupled cluster methods, which all solve the problem close to a given
reference and have been applied to both electronic and frustrated spin
systems [45].

In some problems, particularly when correlation is strong, the mean-field
description is a poor starting point for the problem. In this case, one may
still use a reference-like formalism, but starting with an entangled state.
These methods are called multi-reference methods in quantum chemistry, and
carry considerably more theoretical and computational challenges with them.
In this work, we will highlight how the generalization of methods on a
quantum computer to the multi-reference case is often more natural than in
the classical case.


D. Algorithm Outline 

To use a variational methodology to find approximations to the eigenvalues
and eigenvectors of the Hamiltonian on a quantum computer, it is convenient
to break the task into three distinct pieces (state preparation,
measurement, and classical optimization) and outline the algorithm very
coarsely as

1. Prepare the state $|\Psi(\vec{\theta})\rangle$ or $\rho(\vec{\theta})$
on the quantum computer, where $\vec{\theta}$ can be any adjustable
experimental or gate parameter.

2. Measure the expectation value $\langle H \rangle(\vec{\theta})$.

3. Use a classical non-linear optimizer such as the Nelder-Mead simplex
method to determine new values of $\vec{\theta}$ that decrease
$\langle H \rangle(\vec{\theta})$.

4. Iterate this procedure until convergence in the value of the energy. The
parameters $\vec{\theta}$ at convergence define the desired state.

In the coming sections we will elaborate on what is known about each of
these steps and offer new algorithmic and conceptual improvements.
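
The four steps above can be sketched in a few lines of Python. This is only a toy illustration under several assumptions: the "quantum computer" is a single qubit whose expectation values are evaluated analytically rather than sampled, the Hamiltonian coefficients and starting point are arbitrary, and the optimizer is a compact hand-written Nelder-Mead variant (reflection, expansion, inside contraction, shrink) rather than a library routine.

```python
import math

H_COEFFS = {"X": 0.5, "Y": 0.3, "Z": 1.0}   # illustrative single-qubit Hamiltonian (assumed)

def energy(params):
    """Steps 1-2: 'prepare' cos(t/2)|0> + e^{i phi} sin(t/2)|1> and return <H>."""
    theta, phi = params
    paulis = {"X": math.sin(theta) * math.cos(phi),
              "Y": math.sin(theta) * math.sin(phi),
              "Z": math.cos(theta)}
    return sum(h * paulis[label] for label, h in H_COEFFS.items())

def nelder_mead(f, x0, step=0.5, tol=1e-12, max_iter=2000):
    """Steps 3-4: simplified Nelder-Mead simplex search, iterated to convergence."""
    n = len(x0)
    simplex = [list(x0)] + [[x + (step if i == j else 0.0) for j, x in enumerate(x0)]
                            for i in range(n)]
    values = [f(p) for p in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        if values[-1] - values[0] < tol:       # converged in the energy value
            break
        worst = simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point centroid + t * (centroid - worst).
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        f_ref = f(reflected)
        if f_ref < values[0]:                  # try to expand further
            expanded = along(2.0)
            f_exp = f(expanded)
            simplex[-1], values[-1] = (expanded, f_exp) if f_exp < f_ref else (reflected, f_ref)
        elif f_ref < values[-2]:
            simplex[-1], values[-1] = reflected, f_ref
        else:
            contracted = along(-0.5)           # inside contraction toward the centroid
            f_con = f(contracted)
            if f_con < values[-1]:
                simplex[-1], values[-1] = contracted, f_con
            else:                              # shrink toward the best vertex
                best = simplex[0]
                simplex = [best] + [[b + 0.5 * (x - b) for b, x in zip(best, p)]
                                    for p in simplex[1:]]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    best_index = min(range(n + 1), key=lambda i: values[i])
    return simplex[best_index], values[best_index]

initial_params = [1.0, 1.0]
optimal_params, optimal_energy = nelder_mead(energy, initial_params)
exact_ground = -math.sqrt(sum(h * h for h in H_COEFFS.values()))
print(f"VQE energy {optimal_energy:.8f}, exact ground energy {exact_ground:.8f}")
```

On real hardware the call to `energy` would be replaced by state preparation followed by repeated measurement, so each function evaluation would carry statistical noise that the optimizer must tolerate.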

III. STATE PARAMETERIZATION AND PREPARATION


The set of states a quantum computer can easily manipulate that a classical
computer cannot is not yet fully understood. Given the set of parameters
$\vec{\theta}$, it’s clear that in order for a quantum computer to have an
advantage, one would like the state $|\Psi(\vec{\theta})\rangle$ to be good
at describing the solution of interest, while also difficult to prepare
and/or sample from classically using currently known methods. Here we will
first discuss topics relevant to state preparation for all classes of
states in the variational quantum eigensolver, independent of any notion of
how difficult they are to prepare classically. We will then discuss some
details concerning two classes of states currently believed to be both good
at describing systems of interest and difficult to prepare and/or sample
from classically, namely adiabatically parameterized states and
(multi-reference) unitary coupled cluster states.


A. Error bounds and distributions 

Once a state $|\Psi(\vec{\theta})\rangle$ has been prepared as a function
of some set of parameters $\vec{\theta}$, one would like to know how close
this state is to the solution of the problem being solved. In this work, we
will say a measured value $v$ is known to precision $\epsilon$ based on a
normal distribution approximation with standard deviation $\epsilon/2$,
which is reasonable given that most of our estimates will be derived from
sums of random variates with finite variance, which by the central limit
theorem will rapidly converge to a normal distribution.

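Suppose, for now, that the goal is to know an eigenvalue of $H$ to within a
specified precision $\epsilon$. Let $\lambda_k$ be the eigenvalue of $H$
closest to $\langle H \rangle(\vec{\theta})$. Under these assumptions on
the eigenvalue $\lambda_k$, the Weinstein inequalities hold,

\[ \langle H \rangle(\vec{\theta}) + \sqrt{\mathrm{Var}(\vec{\theta})} \geq \lambda_k \geq \langle H \rangle(\vec{\theta}) - \sqrt{\mathrm{Var}(\vec{\theta})}. \tag{24} \]

As a result, a sufficient condition to rigorously achieve the precision
requirement $\epsilon$ on the eigenvalue $\lambda_k$ is

\[ \mathrm{Var}(\vec{\theta}) \leq \frac{\epsilon^2}{4} \tag{25} \]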

where, as one approaches an eigenstate, the variance approaches 0. When
considering only the ground state, one can derive a simple bound on the
quality of the state. More specifically, in the zero variance limit, if
$\lambda_1$ has multiplicity 1, then the eigenstate corresponding to
$\lambda_1$ is reproduced as well. That is, if a bound $\Delta$ on the gap
to the first excited eigenstate is known in addition to the variance, such
that $|\lambda_i - \lambda_1| \geq \Delta > 0 \ \forall\, i \neq 1$ and
$\epsilon/2 < \Delta$, and we decompose the state into its eigenstate
representation
$|\Psi(\vec{\theta})\rangle = \sum_i c_i(\vec{\theta}) |\chi_i\rangle$,
then we can quantify the quality of state preparation as a function of the
measured variance,

\[ |\langle \Psi(\vec{\theta}) | \chi_1 \rangle|^2 = |c_1(\vec{\theta})|^2 \geq 1 - \frac{\sqrt{\mathrm{Var}(\vec{\theta})}}{\Delta}. \tag{26} \]


For general excited states k, one may find a similar bound exists based on
a measurement of the variance of the operator and a known bound on the gap
$\Delta > 0$, such that

\[ |\langle \Psi(\vec{\theta}) | \chi_k \rangle|^2 = |c_k(\vec{\theta})|^2 \geq 1 - \frac{\mathrm{Var}(\vec{\theta})}{\gamma^2} \tag{27} \]

where $\gamma = \left( \Delta - \sqrt{\mathrm{Var}(\vec{\theta})} \right)$,
and both bounds given here are derived in the appendix. If one has prior
knowledge that a single eigenstate dominates the expansion, such that
$|c_k(\vec{\theta})|^2 > 0.5$, and a lower bound
$0.5 < \alpha \leq |c_k(\vec{\theta})|^2$, then Delos and Blinder showed
through the method of moments that a tighter lower bound on the eigenvalue
is given by

\[ \lambda_k \geq \langle H \rangle(\vec{\theta}) - \left( \frac{1 - \alpha}{\alpha} \right)^{1/2} \sqrt{\mathrm{Var}(\vec{\theta})}. \tag{28} \]

These bounds may be used to estimate the absolute accuracy that the
minimization procedure obtained within the given basis and to decide
whether the eigenvalue has been determined to the desired accuracy and
precision, or whether the state ansatz should be altered to adjust the cost
or accuracy of the procedure.
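
The bounds in this subsection can be checked numerically on a toy spectrum. The sketch below assumes a diagonal Hamiltonian with three illustrative eigenvalues and a trial state specified directly by its weights $|c_i|^2$; these numbers are invented for the example. It also assumes the ground-state overlap bound in the form $|c_1|^2 \geq 1 - \sqrt{\mathrm{Var}}/\Delta$ and the tighter eigenvalue bound in the form $\lambda_k \geq \langle H \rangle - \sqrt{(1-\alpha)/\alpha}\,\sqrt{\mathrm{Var}}$.

```python
import math

eigenvalues = [0.0, 1.0, 2.5]     # illustrative spectrum of a diagonal H (assumed)
weights = [0.90, 0.07, 0.03]      # |c_i|^2 of a trial state, summing to 1 (assumed)

mean = sum(w * lam for w, lam in zip(weights, eigenvalues))
variance = sum(w * lam ** 2 for w, lam in zip(weights, eigenvalues)) - mean ** 2
sigma = math.sqrt(variance)

# Weinstein interval: the eigenvalue closest to <H> lies in [mean - sigma, mean + sigma].
closest = min(eigenvalues, key=lambda lam: abs(lam - mean))
weinstein_lower, weinstein_upper = mean - sigma, mean + sigma

# Ground-state overlap bound, using the gap between the two lowest eigenvalues.
gap = eigenvalues[1] - eigenvalues[0]
overlap_bound = 1.0 - sigma / gap

# Tighter lower bound on the eigenvalue, assuming |c_k|^2 >= alpha > 0.5 is known.
alpha = 0.85
moment_lower = mean - math.sqrt((1.0 - alpha) / alpha) * sigma

print(f"<H> = {mean:.4f}, sqrt(Var) = {sigma:.4f}")
print(f"Weinstein interval: [{weinstein_lower:.4f}, {weinstein_upper:.4f}]")
print(f"overlap bound {overlap_bound:.4f} vs true overlap {weights[0]:.4f}")
print(f"tighter lower bound on the eigenvalue: {moment_lower:.4f}")
```

In this example both eigenvalue bounds hold, and the second one is noticeably closer to the true eigenvalue, at the price of requiring prior knowledge of the dominant weight.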


B. Adiabatically parameterized states 

One type of quantum state that can be explored as a parametric ansatz is
that produced by adiabatic state preparation with a variable path. In
adiabatic quantum computation and adiabatic state preparation, one makes
use of the adiabatic theorem, which states loosely that if one prepares the
lowest eigenstate of an initial Hamiltonian $H_i$, then by continuously
changing the Hamiltonian from $H_i$ to a final problem Hamiltonian $H_p$,
one finishes in the lowest eigenstate of $H_p$ if the evolution was slow
enough. In adiabatic computation, slow enough is quantified relative to the
minimum eigenvalue gap between the ground and first excited states along
the evolution. While many developments have occurred in the area of
adiabatic quantum computation and modifications to the Hamiltonian, perhaps
the most commonly considered form of evolution is defined by

\[ H(s) = A(s) H_i + B(s) H_p \tag{29} \]

where $s \in [0, 1]$, $A(0) = B(1) = 1$, and $A(1) = B(0) = 0$. The
evolution is controlled by continuously changing the parameter $s$ as a
function of time $t$.

Consider the set of all paths of $A(s)$ and $B(s)$ from 0 to 1 as a
function of time $t \in [0, \tau]$, and denote it $F(\tau)$, where $\tau$
is some finite time. Label one such path as $f \in F(\tau)$. In a noiseless
coherent situation at 0 K, the unitarity of evolution dictates that the
final state of the evolution is uniquely determined by the path $f$. In
this situation, we may write the final pure state as a higher-order
function of the path $f$, or $|\Psi[f]\rangle$. Thus any expectation values
of the final state may be written as functionals of the path,
$\langle H \rangle[f]$, and by the variational principle

\[ \langle H_p \rangle[f] = \langle \Psi[f] | H_p | \Psi[f] \rangle \geq \lambda_1 \tag{30} \]

such that the optimal path is the path in $F(\tau)$ that minimizes the
value of $\langle H_p \rangle[f]$. This functional minimization may be
changed into a standard minimization by parameterizing the path $f$ by a
set of parameters $\vec{\theta}$, and performing an optimization on the
parameters $\vec{\theta}$ that determine the path. As such, adiabatic state
preparation may be considered as an ansatz to be used in the variational
hybrid quantum-classical approach, where the state parameters are the shape
or nature of the path. The idea of refining the adiabatic path has been
used before in the context of local adiabatic evolution [58] with great
success. The idea here is to achieve similar benefits in an entirely
black-box manner, guided only by a variational principle and measurements
of the final point of the evolution.

As a simple example, consider a linear path in $F(\tau)$ defined by a
single parameter $\theta_1$ that controls how quickly the evolution is
performed

\begin{align}
A(s) &= \min(1, \theta_1 s) \tag{31} \\
B(s) &= 1 - A(s) \tag{32}
\end{align}

and the parameter $\theta_1$ is restricted by membership in $F(\tau)$ to
$1/\tau \leq \theta_1 < \infty$. In the case of an ideal evolution with
enough quantum resources such that the evolution is much longer than
required by the problem gap, the adiabatic theorem implies that
$\langle H_p \rangle(\theta_1)$ is optimal at the extremal point
$\theta_1 = 1/\tau$. Moreover, in the limit that $\tau \to \infty$, the
adiabatic theorem implies that for any finitely gapped problem $F(\tau)$
contains a path that prepares the exact ground state, and even the simplest
linear paths, which are a subset of $F(\tau)$, are sufficient to do so.

Within this simple example, it’s not immediately clear why one would want
the flexibility offered by the variational quantum eigensolver formulation,
as one could choose the linear path with $\theta_1$ minimal without the
need for any optimization of $\theta_1$. However, a more realistic
situation may be such that $\tau$ is smaller than the required time of
evolution dictated by the problem gap, due to technological constraints or
simply human time constraints in a hard problem. It might also be possible
that no good estimate of the gap is known, and one must attempt several
paths regardless to establish confidence that the evolution is not so fast
as to impair accuracy. One should exercise caution in such attempts,
however, as the probability of success does not necessarily increase
monotonically with evolution time, especially when one is far short of the
time required by the problem gap or when errors are present. Moreover, it
is known that for systems experiencing decoherence or dephasing on the
timescale of evolution, the slowest possible evolution is not optimal in
preparing the ground state of the final problem Hamiltonian. In all
situations, the final density matrix is determined by the parameters of the
path, such that $f$ determines a density matrix
$\rho[f] = \rho(\vec{\theta})$, and an optimal choice of parameters can be
made without detailed knowledge of the gap or errors present in a system by
minimizing
$\langle H_p \rangle[f] = \langle H_p \rangle(\vec{\theta}) = \mathrm{Tr}[\rho(\vec{\theta}) H_p]$
as a function of $\vec{\theta}$.

FIG. 1. The ground and first excited state eigenvalues of the schedule
Hamiltonian $H(s)$ as a function of the annealing path $A(s)$. This shows
the avoided crossing that occurs at $A(s) = 1/2$, the size of which is
controlled by the perturbation parameter $\epsilon$ in the Hamiltonian,
which in our example is set to a value of $\epsilon = 0.1$.

The Hamiltonians may also be generalized to include intermediate operators
such as

\[ H(s) = A(s) H_i + B(s) H_p + \sum_j C_j(s) H_j \tag{33} \]

where one considers any number of intermediate Hamiltonians $H_j$ and
schedules $C_j$ with $C_j(0) = C_j(1) = 0$. The set of paths satisfying
these boundary conditions with available intermediate Hamiltonians
$\{H_j\}$, $F(\tau, \{H_j\})$, offers more flexibility, and again a guiding
principle to select parameters defining the optimal paths is given by the
variational principle.

From this discussion it is clear that adiabatic state preparation where the
path of evolution is defined by some set of parameters $\vec{\theta}$ is
one choice of parametric ansatz for the variational quantum eigensolver. It
can be inferred from the known capabilities of adiabatic quantum
computation that this ansatz is capable of preparing states that cannot be
efficiently prepared or sampled from classically using only a small number
of parameters with currently known methods [66]. As seen in the simple
linear example, the number of parameters to meet this condition may be as
few as 1 for a linear interpolation that is slow enough in ideal
conditions.

1. Variational Adiabatic Path Example 

To further illustrate the utility of a variational perspective on adiabatic
quantum computational methods in a resource-constrained setting, we
consider here a simple 1-qubit problem first studied in the adiabatic
context in the original work of Farhi et al. [54]. In particular, we will
consider this problem in a resource-constrained context where the maximum
evolution time $\tau$ is limited. In this problem, the initial and problem
Hamiltonians are given by

\begin{align}
H_i &= \frac{1}{2}(1 - \sigma_z) + \epsilon \sigma_x \tag{34} \\
H_p &= \frac{1}{2}(1 + \sigma_z) + \epsilon \sigma_x. \tag{35}
\end{align}

If we take the following form of the schedule Hamiltonian

\[ H(s) = A(s) H_i + [1 - A(s)] H_p \tag{36} \]

then the eigenvalues of this problem undergo an avoided crossing with a gap
determined by the size of the perturbation $\epsilon$. For this example we
choose $\epsilon = 0.1$ and the resulting spectrum is plotted in Fig. 1 as
a function of $A(s)$. Suppose that we are attempting to prepare the ground
state of our problem Hamiltonian in a situation where the total evolution
time $\tau$ is limited.

We will consider two types of paths, the first of which is a fixed standard
linear path as a function of time. That is, $A(s) = s = t/\tau$ with
$t \in [0, \tau]$. The second type of path will be a parameterized path of
two variables defined by the best cubic B-spline fit of the 4 points
$(0, 0)$, $(0.15\tau, \theta_1)$, $(0.85\tau, \theta_2)$, $(\tau, 1)$,
where the parameters $\theta_i$ are determined by a non-linear minimization
of the expectation value of the final state in the (possibly non-)adiabatic
evolution with fixed maximum evolution time,
$\langle H(1) \rangle(\theta_1, \theta_2)$. In this simple example we use
the Nelder-Mead simplex method to perform a derivative-free optimization of
$\theta_i$, in analogy to how it might be performed on a quantum device. We
use as an initial condition $\theta_1 = 0.15\tau$ and
$\theta_2 = 0.85\tau$ in the optimization, which corresponds to the linear
path.

The resulting variationally optimal adiabatic spline path $A(s)$ is plotted
alongside the standard linear path in Fig. 2, which shows that the method
naturally finds a path which slows evolution near the closing gap, without
any prior knowledge of the spectrum, and using only measurements at the
endpoint as opposed to the entire path. The effect of this on the success
of preparing the ground state as a function of the total available
evolution time is shown in Fig. 3. From this figure we observe that the
variationally optimal adiabatic spline path is able to achieve similar
results to a linear path with roughly 10 times less evolution time. That
is, at the cost of some classical minimization, we have reduced the quantum
evolution time requirement by a factor of 10 by slightly deforming the
schedule in a black-box manner, relying only on measurements of the final
state of the evolution and no prior knowledge of the problem. Moreover,
even at this reduced evolution time, we achieve the desirable property that
the success of the computation is a monotonically increasing function of
s, which is not true of the linear schedule in this case.

FIG. 2. A comparison of the standard linear path $A(s)$ versus the
two-parameter spline path that is variationally optimal with respect to the
expectation value of the Hamiltonian at the final point $H(1)$. The path
naturally slows the evolution near the location of the avoided crossing,
but is otherwise only slightly distorted from a standard linear path.


2. Pontryagin’s Principle and Non-Adiabatic Bang-Bang 
Quantum Computation 

While adiabatic evolution or attempted adiabatic evolution is one
way to prepare a desired state, it is certainly not the only option.
Non-adiabatic evolution opens a different class of potential
schedules for preparing a desired state guided by the variational
principle. The schedule Hamiltonian $H(s)$ has a particularly
interesting form, namely that it is a linear evolution problem with
a control $A(s)$ that effects a linear coupling. In the theory of
optimal control, it is known through application of Pontryagin's
minimization principle that the optimal control setting for reaching
a desired state of the controlled system when the system has a
linear coupling to the control is to have the control at its
extremal values. That is, $A(s)$ becomes a sequence of step
functions where it takes the values 0 or 1 and need not satisfy the
previous boundary conditions $A(0) = 1$ and $A(1) = 0$. This class
of solutions to optimal control problems is known as a "bang-bang"
solution, and is obviously non-adiabatic by construction. This
principle has been shown in quantum optimal control outside of the
context of quantum computation, where a Monte Carlo minimization
scheme was applied to determine the schedule of step functions, and
a different variational principle was employed [68]. However, this
scheme could be straightforwardly adopted using the variational
principle methods described here to engineer state preparation
schedules for a state of interest, or to perform more general
quantum computation.

FIG. 3. The squared overlap of the system state $|\Psi(s)\rangle$ at
parameter value $s$ with the ground state of $H(1)$,
$|\Psi_f\rangle$, is shown for both the standard linear (Lin)
schedule as well as the variationally optimal spline schedule for
different total evolution times $\tau$. It can be seen here that the
variational schedule offers similar performance to a linear schedule
roughly 10 times as long, indicating an order of magnitude reduction
in the quantum evolution time required for the variationally optimal
schedule.


C. Unitary coupled cluster 

Another method to parametrically explore the Hilbert space of
possible quantum states is the unitary coupled cluster method
developed in quantum chemistry [33]. The projective non-unitary (and
non-variational) form of these equations forms the basis for the
gold standard of classical quantum chemistry, coupled cluster with
single and double excitations with perturbative triple excitations
[CCSD(T)] [33], and has its origins in nuclear physics. The unitary
form of these equations does not have a well defined truncation as
the projective form does, and one must rely on perturbative
arguments to handle the BCH expansion, which break down when the
parameters defining the states grow. This ansatz for electronic
systems has been documented in classical quantum chemistry and in
previous works on the variational quantum eigensolver [33], and here
we document its generalization to generic collections of interacting
two-level quantum systems, which include the anti-symmetric
electronic case as a specialization. We note that coupled cluster
has been utilized before in the context of frustrated spin systems
such as Kagome lattices [35], but our treatment will extend beyond a
fixed reference and also focus on the unitary variant of the method.

To conceptually introduce the approach, recall the introduction of
reference states earlier in this work, and consider a single
computational reference state of an $N$-qubit quantum system,
$|\Phi_{R0}\rangle = |000...0\rangle$. One way to parametrically
explore Hilbert space is to consider the space of states "close" to
$|\Phi_{R0}\rangle$ in the sense of Hamming distance or bit flips.
This method, sometimes called configuration interaction (CI) or
state space restriction, enumerates available states through the use
of spin flips. For example, all states 1 flip away from
$|\Phi_{R0}\rangle$ may be written as

$$|\Psi_{CI1}(\theta)\rangle = \sum_{p_1} \theta_{p_1} \sigma^+_{p_1} |\Phi_{R0}\rangle \tag{37}$$

where in this case $\theta_{p_1}$ are complex coefficients and
$\sigma^+_p$ is the qubit raising operator applied to qubit $p$.
This expansion can be extended systematically by including
multi-qubit spin-flip operators to eventually parametrize all states
in the Hilbert space, or full configuration interaction (FCI). While
this parametric construction of states is straightforward, it has a
number of deficiencies that render it non-optimal. We will not
attempt to explore all of those here, and note only that this ansatz
is efficient to prepare and use classically for any truncation to a
fixed number of spin flips $k$, and it is not clear that there is an
advantage to specifically preparing a linear truncated state on a
quantum device.
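To make the classical efficiency claim concrete, the short Python
sketch below counts the computational basis states reachable within
$k$ flips of an $N$-qubit reference. The function name is
illustrative; the count is simply the sum of binomial coefficients
and so grows only polynomially in $N$ for fixed $k$.

```python
from math import comb


def states_within(n_qubits, k):
    """Number of computational basis states at Hamming distance <= k."""
    return sum(comb(n_qubits, j) for j in range(k + 1))


for k in (1, 2, 10):
    print(k, states_within(10, k))
```

Only when $k$ reaches $N$ does the count become the full $2^N$
dimension of the Hilbert space.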

An idea closely related to this is coupled cluster, which also uses
the spin-flip concept to explore states "close" to a reference, but
as a generator used in exploration of the space. In the case of
quantum computing, its unitary variant is of particular interest, as
unitary state preparation is a natural operation on a quantum
computer. Conventional implementations of coupled cluster often
utilize a single, well defined reference state with all spins
aligned, i.e. $|\Phi_{R0}\rangle = |000...0\rangle$. With this
assumption, one may explore all of quantum space through successive
flips in the computational basis. As a simple example, if one is
interested in only real wavefunctions, the space of single spin
flips may be explored by

$$|\Psi_{CC1}(\theta)\rangle = \exp\left(\sum_{p_1} \theta_{p_1} \left(\sigma^+_{p_1} - \sigma^-_{p_1}\right)\right) |\Phi_{R0}\rangle \tag{38}$$


and successively larger fractions of the space of real wavefunctions
may be covered by introducing multiple spin flips. In the study of
general quantum states however, it is sometimes necessary or more
efficient to explore quantum state space from an arbitrary reference
$|\Phi_R\rangle$, which could be entangled or simply more complex
than $|\Phi_{R0}\rangle$. These challenges have been studied in the
context of multi-reference coupled cluster in quantum chemistry.
Moreover, in quantum computation one may not have perfect knowledge
of the reference state, nor want to require it in the algorithm. For
example, the reference state could be prepared by some adiabatic
state preparation procedure. In this situation one could
accidentally have as a reference state
$|\Phi_R\rangle = |{+}{+}\ldots{+}\rangle$ with
$|{+}\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right)$,
from which no state exploration is possible with the above cluster
operator. The space of non-trivial single qubit operators is spanned
by $\sigma^+, \sigma^-, \sigma^z, I$. As such we want to generalize
to a set of anti-Hermitian operators spanning the same space, given
by

$$i\left(\sigma^+_p + \sigma^-_p\right) = i\sigma^x_p = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} \tag{39}$$

$$\left(\sigma^+_p - \sigma^-_p\right) = i\sigma^y_p = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{40}$$

$$i\sigma^z_p = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} \tag{41}$$

For convenience we have introduced the standard Pauli operators in
the numerical indexing scheme, that is $\sigma^0 = I$,
$\sigma^1 = \sigma^x = X$, $\sigma^2 = \sigma^y = Y$,
$\sigma^3 = \sigma^z = Z$. As one is not typically interested in
global phase factors, we implicitly ignore the identity operator in
all equations going forward and with the remaining operators we may
write the first order cluster operator as

$$T_1(\theta) = i \sum_{p_1 \alpha_1} \theta^{\alpha_1}_{p_1} \sigma^{\alpha_1}_{p_1} \tag{42}$$

where $\theta^{\alpha_j}_{p_j}$ are real, Roman indices $p_j$
indicate different qubits, and the Greek indices indicate different
Pauli operator bases. More generally the $k$'th order cluster
operator may be written as

$$T_k(\theta) = i \sum_{p, \alpha} \theta^{\alpha}_{p} \sigma^{\alpha}_{p} \tag{43}$$

where $\sigma^{\alpha}_{p} = \sigma^{\alpha_1}_{p_1} \ldots \sigma^{\alpha_k}_{p_k}$,
$\theta^{\alpha}_{p}$ is a $k$-index tensor containing the
variational parameters, and the full cluster operator up to order
$k$ is written

$$T^{(k)}(\theta) = \sum_i^k T_i(\theta) \tag{44}$$

From this general cluster operator, we define the unitary coupled
cluster state of order $k$ with reference $|\Phi_R\rangle$ as

$$|\Psi^{(k)}_{CC}(\theta)\rangle = \exp\left(T^{(k)}(\theta)\right) |\Phi_R\rangle \tag{45}$$

With this exposition it becomes clear that unitary coupled cluster
generators for a totally general spin reference case at order $k$
are the anti-Hermitian algebra $su(2^k)$ and the set of possible
actions on the qubits are all possible unitary transformations on
$k$ qubits that leave the global phase unchanged, or $SU(2^k)$.

This represents a parametric state preparation with $O((3N)^k)$ real
parameters. While this has the potential to represent any known
quantum operation at sufficient order and precision of
implementation, practically speaking one often restricts to the case
of $k = 2$, which has been found to be quite powerful in expressing
states in quantum chemistry. This represents a powerful ansatz with
a number of parameters that grows only quadratically in the size of
the system. Additionally, the state preparation is manifestly
unitary by construction, and has no known efficient classical
preparation or method for sampling with arbitrary (possibly
entangled) reference $|\Phi_R\rangle$. As has been noted previously,
this state can be prepared efficiently for any fixed order $k$ to a
specified accuracy on a quantum device by using the Suzuki-Trotter
factorization of the unitary operator
$\exp\left(T^{(k)}(\theta)\right)$ [74]. We note that, as one is not
trying to faithfully reproduce some dynamics as in many uses of the
Suzuki-Trotter factorization, a coarse factorization may suffice,
simply altering the definition of the ansatz while still remaining
difficult to simulate classically.

As an extension to the suggested implementation of spin unitary
coupled cluster by Suzuki-Trotter, one may use the connection to
$su(2^k)$ to take a more geometric approach and explore states
through geodesic constructions as was done by Nielsen et al.
Moreover, if one allows values of different parameters at different
Trotter steps, one may perform arbitrary 1 and 2 qubit gates at
$k = 2$, which forms a universal gate set, and the ansatz can be
made equivalent to an arbitrary quantum circuit with a sufficient
number of Trotter steps. To see this, consider the first order in a
Trotter factorization with a second order cluster operator and a
Trotter number of $N$. One could prepare the desired state from a
given reference $|\Phi_{\text{ref}}\rangle$ as

$$|\Psi_{CC}(\theta)\rangle = \left[\prod_{p_1 p_2 \alpha_1 \alpha_2} \exp\left(i \frac{\theta^{\alpha_1 \alpha_2}_{p_1 p_2}}{N} \sigma^{\alpha_1}_{p_1} \sigma^{\alpha_2}_{p_2}\right)\right]^N |\Phi_{\text{ref}}\rangle \tag{46}$$

where we emphasize that it is more correct to consider the use of
the exponential splitting as a redefinition of the ansatz than an
approximation. Instead of following this precise splitting
procedure, where the same parameters are used in each Trotter step,
one can relax the parameters to have independent values at each time
step, and to not split Pauli operators acting on the same two qubits
within one time step. This results in an ansatz of the form

$$|\Psi_{CC}(\theta)\rangle = \prod_t^N \left[\prod_{p_1 p_2} \exp\left(i \sum_{\alpha_1 \alpha_2} \theta^{\alpha_1 \alpha_2}_{p_1 p_2}(t)\, \sigma^{\alpha_1}_{p_1} \sigma^{\alpha_2}_{p_2}\right)\right] |\Phi_{\text{ref}}\rangle. \tag{47}$$
The operator defined by

$$O = i \sum_{\alpha_1 \alpha_2} \theta^{\alpha_1 \alpha_2}_{p_1 p_2}(t)\, \sigma^{\alpha_1}_{p_1} \sigma^{\alpha_2}_{p_2} \tag{48}$$

can express an arbitrary element in $su(4)$ and thus its exponential
$\exp(O)$ can be used to form an arbitrary two qubit gate on any two
qubits, or said differently, an arbitrary element of $SU(4)$ on any
two qubits. Arbitrary two qubit gates on any qubit are known to
constitute a universal gate set and then clearly can be used to
construct any desired universal gate set such as the Clifford+T set.
This establishes a clear connection between second order unitary
coupled cluster and universal quantum computation through relaxation
of parameters in an exponential operator splitting. This also opens
the research direction of connecting states of this type to tensor
networks where the network is defined by the action at each
"timestep" of unitary coupled cluster [75].
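The dimension counting behind these statements can be checked with a
short Python sketch that enumerates the Pauli strings acting
non-trivially on exactly $k$ of $N$ qubits. The helper name is
hypothetical; the enumeration only illustrates why a two-qubit
generator set has the 15 elements needed for $su(4)$ and why the
parameter count at order $k$ scales as $O((3N)^k)$.

```python
from itertools import combinations, product
from math import comb


def pauli_generators(n_qubits, order):
    """All Pauli strings acting non-trivially on exactly `order` qubits.

    Each generator is a tuple of (qubit index, Pauli label) pairs.
    """
    gens = []
    for qubits in combinations(range(n_qubits), order):
        for paulis in product("XYZ", repeat=order):
            gens.append(tuple(zip(qubits, paulis)))
    return gens


# On two qubits, orders 1 and 2 together give 6 + 9 = 15 generators,
# matching dim su(4) = 4**2 - 1.
two_qubit = pauli_generators(2, 1) + pauli_generators(2, 2)
print(len(two_qubit))

# At fixed order the count is comb(N, k) * 3**k, i.e. O((3N)**k).
print(len(pauli_generators(6, 2)), comb(6, 2) * 3 ** 2)
```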

D. Fermionic UCC 

Due to particular interest in quantum chemistry and other fermionic
problems, it is worth discussing the specialization of this method
to those cases. First taking again the case of a fixed computational
reference $|\Phi_{R0}\rangle$, in analogy to the spin case, the
first and second order cluster operators conventionally take on a
simple form, that is

$$T^{(1)}(\theta) = \sum_{i_1 p_1} \theta^{p_1}_{i_1} \left(a^\dagger_{p_1} a_{i_1} - a^\dagger_{i_1} a_{p_1}\right) \tag{49}$$

$$T^{(2)}(\theta) = \sum_{i_1 i_2 p_1 p_2} \theta^{p_1 p_2}_{i_1 i_2} \left(a^\dagger_{p_1} a_{i_1} a^\dagger_{p_2} a_{i_2} - a^\dagger_{i_2} a_{p_2} a^\dagger_{i_1} a_{p_1}\right) \tag{50}$$

with $i_j$ indexing the occupied orbitals, $p_j$ indexing the
unoccupied orbitals, and higher orders defined in the obvious way of
including more excitation operators. These generators are
constructed to conserve particle number at all orders and
parametrically depend on $O(M^{2k})$ real parameters at order $k$.
In the case of a single reference, it should be noted that all the
excitation operators commute as a direct consequence of the creation
and annihilation operators being restricted to act on different
subspaces. As a result, Trotter factorization of this ansatz may be
performed to arbitrary times exactly, which allows one to explore
regimes where low order truncations of the BCH expansion are not
accurate and thus may be difficult to sample from classically.

We can understand the equivalent action on qubits by mapping the
fermion operators to spin operators via either the Jordan-Wigner or
Bravyi-Kitaev transformations discussed earlier in this work. In the
case of the Jordan-Wigner mapping, as a result of the non-locality
of these mappings, at every fermion order $k$, we find spin flips up
to all $N$ spins and observe that the allowed operations on the
qubits are a non-trivial subgroup of $SU(2^N)$ at every order $k$.
This demonstrates that it is key to develop the ansatz in the
fermionic framework before mapping the problem to a spin
representation. If one were to first map to spins, then use the spin
coupled cluster formulation, the ansatz might explore many
irrelevant or symmetry broken states, such as mixtures of different
particle number states.

In analogy to our exposition on spins however, this type of cluster
operator is reference state specific. That is, there are some
reference states from which it will fail to parameterize the
entirety of the $N$ fermion space, and extensions to multi-reference
states can require a different cluster operator for each reference.
This can be seen from dimension counting in the vector space of the
fermion excitation operators. For example, at first order these
operators only span a real vector space of dimension $M^2/2 - M$
whereas the full space of all 1 fermion linear operators has real
dimension $M^2$. In classical implementations of multi-reference
coupled cluster there are many different approaches to solving this
and related problems going by names such as "universal" or "state
selective" multi-reference coupled cluster. In the case of unitary
coupled cluster on a quantum computer, in analogy to how we
generalized the distinguishable spin operators, we can generalize
the fermion operators to treat arbitrary references without such
concerns.

The operators $a^\dagger_i a_j$ and their tensor products, where $i$
and $j$ run over all $M$ orbitals (instead of restricting them to
occupied and unoccupied relative to a reference), form a basis for
the real vector space of operators on $N$ fermion states. As a
result, to allow arbitrary action on the space of $N$ fermions, the
span of the generating operators used must match this. To span the
same real vector space as these operators we use the following
anti-Hermitian basis

$$i\left(a^\dagger_p a_q + a^\dagger_q a_p\right) = A^0_{pq}; \quad 1 \leq p \leq q \leq M \tag{51}$$

$$a^\dagger_p a_q - a^\dagger_q a_p = A^1_{pq}; \quad 1 \leq p < q \leq M \tag{52}$$

and all possible $k$-fold tensor products of these operators. One
can verify by dimension counting of the real vector space that these
operators in fact span the entire space of possible fermion
operators. With these operators, the first order fermion cluster
operator can be written as

$$T_1(\theta) = \sum_{p_1 q_1 \alpha} \theta^{\alpha}_{p_1 q_1} A^{\alpha}_{p_1 q_1} \tag{53}$$

where $p_j$ and $q_j$ run over all orbitals and $\alpha$ indexes the
anti-Hermitian fermion generators. Higher orders of the cluster
operator can be built naturally from tensor products of these
operators, such that at the $k$'th order we have

$$T_k(\theta) = \sum_{p, q, \alpha} \theta^{\alpha}_{pq} A^{\alpha}_{pq} \tag{54}$$

where the same vector operator shorthand as the spin case has been
used. With this construction the power of the cluster operator is
state agnostic and fermion number conserving. We term this the state
universal quantum unitary coupled ansatz (SU-QUCA). Again, in all
cases the optimal choice of the parameters $\theta$ is determined
through the application of the variational principle with respect to
the Hamiltonian of interest.


E. Quantum Error Suppression and Symmetries 

A variational hybrid quantum-classical algorithm is designed to
perform on pre-threshold computers, where gates may be imperfect and
random bit flip or phase errors may be introduced into the
computation. Fortunately, the variational formulation allows one to
suppress certain types of errors naturally, which we will discuss
here in the context of variational error suppression.

In the design of a parametric wavefunction ansatz, it is common to
enforce known symmetry requirements for both theoretical and
practical purposes. For example, in the fermionic unitary coupled
cluster wavefunctions, the ansatz is designed to conserve the number
of particles for all possible choices of the parameters $\theta$.
That is, both the ansatz and the Hamiltonian commute with the number
operator $N = \sum_i a^\dagger_i a_i$. While we haven't explicitly
done so here, it is also possible to adapt the cluster operators to
conserve total spin. In a fully error corrected quantum computer,
this introduces no additional concerns and can simplify the problem
under consideration. However, in a pre-threshold device or any with
only partial error correction this must be taken into consideration.

Consider the preparation of an ansatz from some initial state, which
we denote as $U_a(\theta)$. In a pre-threshold, non-error corrected
quantum device, there can be a distinction between the formal
specification of the ansatz preparation $U_a(\theta)$ as a gate or
operation sequence and the operation sequence actually performed on
the system with inputs $\theta$, which we will denote
$\tilde{U}_a(\theta)$. We call an error in such an implementation
suppressible if there exists a correction input vector $\beta$ such
that $\|U_a(\theta) - \tilde{U}_a(\theta + \beta)\| < \epsilon$ for
a specified $\epsilon > 0$, and further denote it variationally
suppressible if the corrected vector $\theta + \beta$ also
corresponds to an optimum on the parameter surface. In such a case,
the variational quantum eigensolver can suppress these errors
naturally without detailed knowledge of the error mechanism. A
troublesome non-suppressible case is when an error violates a
symmetry of the ansatz. More explicitly, if we denote the symmetries
of the ansatz as the set of operators $S$ such that
$[U_a(\theta), S] = 0$ for all $\theta$, then for any symmetry
violating error $U_\epsilon$ such that $[U_\epsilon, S] \neq 0$,
there does not exist any correction vector $\beta$ such that the
desired preparation can be performed.

To be more concrete, consider the two examples given in this
section, parameterized adiabatic state preparation and coupled
cluster. In these cases, some symmetries of the ansatz can be
trivially determined by the generating operators. In adiabatic state
preparation, the symmetries will be given by the set of operators
$S$ such that $[H_i, S] = 0$ for all Hamiltonians $H_i$, including
the initial, problem, and intermediate Hamiltonians. In the case of
coupled cluster, this will be the set of operators $S$ such that
$[E_i, S] = 0$ for all excitation type operators $E_i$, such as the
number operator. These represent sufficient conditions for
$[S, U_a(\theta)] = 0$ for every possible choice of $\theta$. In the
case of fermionic coupled cluster, the generating operators are
specifically designed to conserve particle number, such that one
symmetry of the system is the number operator
$N = \sum_i a^\dagger_i a_i$. In the Jordan-Wigner qubit
representation, this simply counts the number of qubits in state
$|0\rangle$. As such, if a random error of the form
$U_\epsilon = i\sigma^x$ acts on any qubit, this error is not
suppressible.

FIG. 4. A cartoon depicting the concept of variationally
suppressible errors on energy contours. Dotted lines represent
errors that move the state away from the variational minimum, and
solid lines characterize a shift of the ansatz parameters that can
return the state to the minimum. In this case the vertical axis is
within the manifold of the ansatz parameters, while the horizontal
axis is not, as indicated by the cross in the line returning along
that axis. However, by adding additional operators, represented by
the diagonal dashed line, it becomes possible to suppress these
errors variationally.

This particular error can be made suppressible by extending the set
of generating operators to include spin flips (e.g. $i\sigma^x_p$
and $i\sigma^y_p$) or fermionic non-number conserving operators,
e.g. $(a^\dagger_q - a_q)$ and $i(a^\dagger_q + a_q)$, as well as
all tensor products of these operators with the rest of the
generating set. With the addition of these operators, this error
becomes suppressible; however, the error will only be variationally
suppressible if the desired symmetry state of the ansatz corresponds
to an energetic minimum. In the event that it does not, one can
construct an auxiliary Lagrangian of the form

$$\mathcal{L} = H + \sum_i \lambda_i \left(S_i - s_i\right)^2 \tag{55}$$

where $\lambda_i$ are penalty multipliers and $s_i$ are constants
corresponding to the desired expectation values of the operators
$S_i$. In order to be efficient, measurements corresponding to
$S_i^2$ and $S_i$ must also be efficient. Using this construction,
one may minimize with respect to expectation values
$\langle \mathcal{L} \rangle (\theta)$ instead of
$\langle H \rangle (\theta)$, and in the limit that
$\lambda_i \to \infty$ the symmetries will be exactly preserved
while allowing variational error suppression under action by the
extended operator set.
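To see which quantities must actually be measured, it may help to
expand the expectation value of the auxiliary Lagrangian term by
term. The expansion below follows directly from linearity, assuming
the $\lambda_i$ and $s_i$ are real constants:

```latex
\langle \mathcal{L} \rangle(\theta)
  = \langle H \rangle(\theta)
  + \sum_i \lambda_i \left(
      \langle S_i^2 \rangle(\theta)
      - 2 s_i \langle S_i \rangle(\theta)
      + s_i^2
    \right)
```

Each penalty term therefore needs only the averages of $S_i$ and
$S_i^2$ on the prepared state, in addition to the usual energy
estimate.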

This methodology also allows for access to excited states that
correspond to an energetic minimum of a given symmetry. An example
of this could be the lowest triplet energy state of a molecule with
a natural singlet ground state, or the ionic state of a molecule
after photodissociation. Use of this construction may allow easier
access to these particularly important excited states, as compared
to a more general excited state approach.

IV. OPERATOR AVERAGING 

Once a trial state $|\Psi(\theta)\rangle$ has been prepared, the
next crucial step in the VQE is the evaluation of the objective
function corresponding to the problem operator $H$,
$\langle H \rangle (\theta) = \langle \Psi(\theta)| H |\Psi(\theta) \rangle$.
One possibility is to use the quantum phase estimation algorithm. If
$|\Psi(\theta)\rangle$ is an eigenstate, then the value is obtained
after a single state preparation with a cost in the desired
precision of $O(1/\epsilon)$. Unfortunately, to achieve this
precision, all of the operations must be coherent, which is a
prohibitive technological requirement for current and near-term
quantum computers. Moreover, if the state is instead a mixture of
many eigenstates, it will still require $O(1/\epsilon^2)$
repetitions of the entire procedure to converge the value
$\langle H \rangle (\theta)$ to a precision $\epsilon$. The use of
quantum phase estimation done to a precision surpassing $\epsilon$
opens the possibility to instead minimize the minimal value found in
a projective measurement of the energy in a sequence of phase
estimation runs. However, we do not explore that option further
here.

In 2014, Peruzzo and McClean et al. [1] suggested a way to retain
the advantage of preparing classically inaccessible states while
removing the overwhelming coherence time requirements to measure the
energy. This method is called Hamiltonian averaging and has been
discussed recently in more detail [21].

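The original formulation used the fact that tensor products of Pauli
operators form a basis for the space of Hermitian operators. As such
any Hermitian operator $H$ may be written as

$$H = \sum_{i_1 \alpha_1} h^{i_1}_{\alpha_1} \sigma^{\alpha_1}_{i_1} + \sum_{i_1 \alpha_1 i_2 \alpha_2} h^{i_1 i_2}_{\alpha_1 \alpha_2} \sigma^{\alpha_1}_{i_1} \sigma^{\alpha_2}_{i_2} + \ldots \tag{56}$$

and by linearity the expectation value as

$$\langle H \rangle = \sum_{i_1 \alpha_1} h^{i_1}_{\alpha_1} \langle \sigma^{\alpha_1}_{i_1} \rangle + \sum_{i_1 \alpha_1 i_2 \alpha_2} h^{i_1 i_2}_{\alpha_1 \alpha_2} \langle \sigma^{\alpha_1}_{i_1} \sigma^{\alpha_2}_{i_2} \rangle + \ldots \tag{57}$$

As a result, all that is required is the weighted sum of the results
from simple Pauli measurements. This is an operation requiring
coherence time $O(1)$ assuming parallel qubit rotation and readout
are possible; otherwise the coherence time required is $O(k)$ where
$k$ is the locality of the term to be measured. Previously, some
scaling analysis of this procedure was done in the context of
locality, but here we detail more specifically how to perform the
averaging and verify the error on the fly in a simulation of a
general state.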

Consider the Hamiltonian decomposed as

$$H = \sum_\gamma H_\gamma \tag{58}$$

where each $H_\gamma$ is a Hermitian operator with associated
measurement outcomes $m_1$ and $m_2$, of which Pauli operators are a
special case. In order to get the desired precision in a normal
distribution approximation, we require a variance of $\epsilon^2$ in
the estimator of $\langle H \rangle$, which we denote with a large
hat as $\widehat{\langle H \rangle}$. The estimator we have
described is constructed as a sum of independent estimators
$\widehat{\langle H_\gamma \rangle}$,

$$\widehat{\langle H \rangle} = \sum_\gamma \widehat{\langle H_\gamma \rangle} \tag{59}$$

each of which is built from a sequence of independent measurements
$X$. As the measurements are taken from independent state
preparations, we have that the covariance between the individual
estimators on the measurements is 0, or
$\mathrm{Cov}[\widehat{\langle H_\alpha \rangle}, \widehat{\langle H_\beta \rangle}] = 0 \ \forall\ \alpha \neq \beta$,
and thus the variance of the total estimator is the sum of the
variances of the individual estimators

$$\mathrm{Var}\left[\widehat{\langle H \rangle}\right] = \sum_\gamma \mathrm{Var}\left[\widehat{\langle H_\gamma \rangle}\right]. \tag{60}$$

The individual estimators are constructed as the mean of a sequence
of independent measurements corresponding to the operator $H_\gamma$
on independent preparations of the state $\rho$. Each measurement of
the total operator requires a state preparation and measurement for
each individual term, and thus the total number of expected state
preparations and measurements to achieve a precision of $\epsilon$
in $\langle H \rangle$ is

$$n_{\text{expect}} = \frac{M}{\epsilon^2} \sum_\gamma \mathrm{Var}[H_\gamma] \tag{61}$$

While this offers insight into how many measurements one expects to
take, it does not yet constitute a practical algorithm, as the true
value of the variances $\mathrm{Var}[H_\gamma]$ in general will be
unknown except in toy examples. Instead, one has access to the
sample mean and unbiased sample variance as the measurements are
taken. That is, after $n$ measurements $\{x_i\}$ of the operator
$H_\gamma$ have been taken on $\rho$, one computes

$$\widehat{\langle H_\gamma \rangle}(\{x_i\}) = \frac{1}{n} \sum_i^n x_i$$

$$\widehat{\mathrm{Var}}[H_\gamma](\{x_i\}) = \frac{1}{n-1} \sum_i^n \left(x_i - \widehat{\langle H_\gamma \rangle}(\{x_i\})\right)^2 \tag{62}$$

and continues taking measurements until
$\mathrm{Var}[\widehat{\langle H_\gamma \rangle}] \approx \widehat{\mathrm{Var}}[H_\gamma](\{x_i\})/n < \epsilon^2/M$,
and moves on to the next term. While straightforward, this
methodology suffers from some ambiguities when using a small number
of measurements or when the state $\rho$ represents an eigenstate of
the operator $H_\gamma$. In particular, it is unclear how many
measurements are required to confirm that the variance is 0 to the
desired precision. This is related to how unobserved events are
addressed in a frequentist perspective of probability. In practical
implementations of stochastic sampling methods these issues are
often not addressed rigorously, and a reasonable minimum number of
measurements is chosen, such as $n = 1000$ or $n = 10000$, before
the estimates of $\widehat{\mathrm{Var}}[H_\gamma](\{x_i\})$ are
taken to be reliable, trusting that after this number of samples the
estimator is well represented by a normal distribution and the
higher moments associated with errors in estimates of the variance
vanish rapidly. An alternative perspective that addresses such
concerns from the outset is a Bayesian perspective, which has been
investigated in the context of quantum phase estimation, and which
we now explore in the context of Hamiltonian averaging.
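The frequentist stopping rule can be sketched in Python as follows.
This is a classical toy under stated assumptions, not a prescription
from the original work: the quantum measurement is replaced by a
hypothetical `sample` callable returning $\pm 1$ with fixed
probability, and the names `average_term`, `n_min` and `n_max` are
illustrative.

```python
import random


def average_term(sample, eps, n_terms, n_min=1000, n_max=1_000_000):
    """Average one term until Var_hat / n < eps**2 / n_terms.

    `sample` returns one measurement outcome per call. At least n_min
    samples are taken before the variance estimate is trusted, and
    n_max is a safety cap on the total number of samples.
    """
    xs = [sample() for _ in range(n_min)]
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    n = n_min
    while True:
        mean = total / n
        var = (total_sq - n * mean * mean) / (n - 1)  # unbiased sample variance
        if var / n < eps ** 2 / n_terms or n >= n_max:
            return mean, var, n
        x = sample()
        total += x
        total_sq += x * x
        n += 1


# Toy stand-in for a Pauli measurement: outcome +1 with probability 0.8.
rng = random.Random(0)
mean, var, n = average_term(
    lambda: 1.0 if rng.random() < 0.8 else -1.0, eps=0.02, n_terms=1)
print(mean, var, n)
```

For this toy term the true mean is 0.6 and the true variance is
0.64, so one expects on the order of $0.64/\epsilon^2$ samples
before the criterion is met.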


A. Bayesian Perspective 

In a Bayesian perspective, we start from an uninformative prior for
the distribution of $\langle H_\gamma \rangle$. In the case of two
measurement outcomes, the likelihood function is the binomial
likelihood, and the posterior distributions after measurement can be
worked out analytically when used with a conjugate Beta prior. These
distributions are well-defined even for small numbers of
measurements or when $\rho$ is close to an eigenstate of
$H_\gamma$, resulting in potentially unobserved events in a sequence
of measurements.

Consider a sequence of independent measurements $X$ with two
possible outcomes $\{m_1, m_2\}$, such as the quantum measurement of
a Pauli operator. The likelihood of observing the sequence of
measurements $X$ is completely defined by a single variable $p$, and
is written

$$P(X|p) = \binom{N}{r} p^r (1 - p)^{N - r} \tag{63}$$

with $N$ being the total number of measurements $X$ and $r$ being
the number of measurements equal to $m_1$. The value $p$ defines the
probability of observing $m_1$ and will be directly related to
$\langle H_\gamma \rangle$. Our current knowledge of $p$ is defined
by the prior distribution $P(p)$. Many choices for the form of the
prior distribution can be made, but an analytical result can be
obtained by choosing the conjugate prior to the Binomial
distribution, which is the Beta distribution

$$P(p; \alpha, \beta) = \mathrm{Beta}(\alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)} p^{\alpha - 1} (1 - p)^{\beta - 1}. \tag{64}$$

The Beta distribution is a function of two parameters $\alpha$ and
$\beta$, and these are the parameters we will seek to update with a
Bayes inference scheme. Simply put, given the measurements $X$ with
$r$ instances of $m_1$, the posterior distribution is given by

$$P(p|X) = \mathrm{Beta}(\alpha + r, \beta + N - r) = \mathrm{Beta}(\alpha', \beta') \tag{65}$$

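From $\alpha'$ and $\beta'$, one can determine both the mean value
and variance in our desired quantity as

$$\langle p \rangle = \frac{\alpha'}{\alpha' + \beta'} \tag{66}$$

$$\mathrm{Var}[p] = \frac{\alpha' \beta'}{(\alpha' + \beta')^2 (\alpha' + \beta' + 1)} \tag{67}$$

and the expected value and variance of $p$ may be used in the
estimators associated with $H_\gamma$. In particular

$$\widehat{\langle H_\gamma \rangle} = \langle p \rangle m_1 + (1 - \langle p \rangle) m_2 \tag{68}$$

$$\mathrm{Var}\left[\widehat{\langle H_\gamma \rangle}\right] = (m_1 - m_2)^2\, \mathrm{Var}[p] \tag{69}$$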


A reasonable choice of initial prior in this situation before any
measurements are taken is the uniform prior (sometimes called the
Bayes' prior probability in this case) $\mathrm{Beta}(1, 1)$. Thus a
practical strategy in the Bayes setting is to let
$\alpha = \beta = 1$, then take $N$ measurements. One then updates
$\alpha$ and $\beta$ to $\alpha'$ and $\beta'$ according to Eq. (65)
and continues taking measurements until
$\mathrm{Var}[\widehat{\langle H_\gamma \rangle}] < \epsilon^2/M$,
which is simply computed as a function of the new $\alpha$ and
$\beta$ through the above formulae. We note that if one has a good
reference state, a prior distribution can be constructed from it to
yield an informative prior. This has the potential to reduce the
cost and will converge to the same result under most reasonable
conditions. However, one must be careful as this may introduce a
bias for poor reference states with a small number of measurements.

After using either the frequentist or Bayesian approach to check
convergence of $\mathrm{Var}[\widehat{\langle H_\gamma \rangle}]$
for all $\gamma$, under a normal distribution approximation the
final estimation of $\langle H \rangle$ is precise to the desired
precision $\epsilon$.
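The Bayesian update is simple enough to state in a few lines of
Python. The sketch below uses exact rational arithmetic so that the
numbers can be checked by hand; the function names are illustrative,
and the outcomes $m_1 = +1$, $m_2 = -1$ assume a Pauli measurement.

```python
from fractions import Fraction


def beta_update(alpha, beta, n_meas, r):
    """Posterior Beta parameters after n_meas measurements, r equal to m1."""
    return alpha + r, beta + n_meas - r


def beta_mean_var(alpha, beta):
    """Mean and variance of p under Beta(alpha, beta)."""
    total = alpha + beta
    mean = Fraction(alpha, total)
    var = Fraction(alpha * beta, total ** 2 * (total + 1))
    return mean, var


def term_estimate(alpha, beta, m1=1, m2=-1):
    """Estimator of <H_gamma> and its variance from the posterior on p."""
    p_mean, p_var = beta_mean_var(alpha, beta)
    return p_mean * m1 + (1 - p_mean) * m2, (m1 - m2) ** 2 * p_var


# Uniform prior Beta(1, 1), then 10 measurements with 7 outcomes m1.
a_post, b_post = beta_update(1, 1, n_meas=10, r=7)
print(a_post, b_post)                 # posterior Beta(8, 4)
print(term_estimate(a_post, b_post))  # estimate 1/3 with variance 8/117
```

Note that even with zero observed instances of one outcome the
posterior variance stays strictly positive, which is what removes
the ambiguity of the frequentist stopping rule near an eigenstate.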

An alternative to the normal approximation confidence
intervals may be used in the Bayesian approach if desired.
As the measurements are taken for each of the operators
in the Bayesian approach, the associated probability
distribution $P(\langle H_\gamma \rangle)$ is known. The probability distribution
of a sum of independent random variables is known
to be the convolution of the individual probability distributions,
such that

$$P(\langle H \rangle) = \mathop{\ast}_{\gamma} P(\langle H_\gamma \rangle) \tag{70}$$

Unfortunately the convolution of two Beta distributions
does not have a known analytical result, and these convolutions
must be performed numerically. Once the probability
distribution $P(\langle H \rangle)$ is known, one may numerically
bracket the desired confidence interval to determine the
precision of the approach. Practically speaking, the convergence
of this final probability distribution to a normal
distribution is quite rapid, and thus the normal approximation
relying on the variance is the standard procedure.
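
As an illustration of the numerical route, the following sketch convolves the per-term densities on a common grid and brackets a central interval from the resulting cumulative distribution. It is only one possible discretization, written under assumptions that are not part of the original text: the grid spacing `dx`, the helper names, the tuple layout `(alpha, beta, m1, m2)` with `m1 > m2`, and the renormalization step are all choices made here for illustration.

```python
import math

import numpy as np


def term_density(alpha, beta, m1, m2, dx):
    """Density of h = p*m1 + (1-p)*m2 with p ~ Beta(alpha, beta) on a grid.

    Returns (x0, pdf): the left grid edge and the density at x0, x0+dx, ...
    Assumes m1 > m2.
    """
    n = int(round((m1 - m2) / dx)) + 1
    x = m2 + dx * np.arange(n)
    p = np.clip((x - m2) / (m1 - m2), 1e-12, 1 - 1e-12)
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    pdf = np.exp(log_norm + (alpha - 1) * np.log(p) + (beta - 1) * np.log1p(-p))
    pdf /= pdf.sum() * dx  # renormalize on the grid
    return x[0], pdf


def convolve_terms(terms, dx):
    """Convolve the per-term densities as in Eq. (70)."""
    x0, pdf = term_density(*terms[0], dx)
    for term in terms[1:]:
        y0, q = term_density(*term, dx)
        pdf = np.convolve(pdf, q) * dx  # discrete approximation of the integral
        x0 += y0                        # supports add under convolution
    x = x0 + dx * np.arange(len(pdf))
    return x, pdf


def central_interval(x, pdf, dx, level=0.95):
    """Bracket the central interval containing `level` of the probability."""
    cdf = np.cumsum(pdf) * dx
    lo = x[np.searchsorted(cdf, (1 - level) / 2)]
    hi = x[np.searchsorted(cdf, (1 + level) / 2)]
    return lo, hi


dx = 1e-3
# Two hypothetical terms: posteriors Beta(81, 21) and Beta(31, 71),
# with outcomes (+1, -1) and (+0.5, -0.5) respectively.
terms = [(81.0, 21.0, 1.0, -1.0), (31.0, 71.0, 0.5, -0.5)]
x, pdf = convolve_terms(terms, dx)
mean = float(np.sum(x * pdf) * dx)
lo, hi = central_interval(x, pdf, dx)
print(mean, lo, hi)
```

Comparing the bracketed interval with the one implied by the summed variances gives a quick check of how well the normal approximation holds for a given set of posteriors.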


B. Cost Reduction

The computational cost of Hamiltonian averaging can
be reduced in a number of ways. In this section we will
consider two methods for doing so. In the first we will
remove terms that are deemed unimportant, and in the
second we will consider how terms are grouped in order
to reduce the required number of state preparations.


1. Term Truncation

The first strategy to reduce the number of measurements
and state preparations required is to avoid measurements
guaranteed not to contribute at the desired
precision to the total estimate. To do this, one may order
the terms by their expected maximum contribution to
the estimate. For example, the magnitude of a weighted
Pauli operator $H_\gamma = h_\gamma \sigma$ is bounded such that for any
state $\rho$, $|\langle H_\gamma \rangle| \le |h_\gamma|$. Once the terms are ordered
according to the maximum expected contribution, with
the maximum at $\gamma = M$, we can construct the sequence
of partial sums

$$e_k = \sum_{i=1}^{k} |h_i| \tag{71}$$

with $e_0$ defined to be 0, that defines the maximal bias
introduced by truncating the $k$ smallest terms. Using this
sequence, one may choose a constant $C \in [0,1)$ and remove
the $k^*$ lowest terms by finding the maximal index $k^*$
in the sequence such that $e_{k^*} < C\epsilon$. In this choice, $C$
determines both the number of terms one is allowed to
neglect and the amount of bias introduced. As the estimator
is now biased, one must consider the bias-variance tradeoff
to maintain the desired accuracy. In order to achieve
an expected mean-square error of at most $\epsilon^2$ in the final answer,
we must decrease the variance of the estimator for the remaining
terms such that $C^2\epsilon^2 + \sum_\gamma \mathrm{Var}[\langle H_\gamma \rangle] < \epsilon^2$.
This may be achieved by changing the per-term variance
threshold for each $\langle H_\gamma \rangle$ to be $(1 - C^2)\epsilon^2/(M - k^*)$. This
results in a new expected number of measurements

$$n^*_{\mathrm{expect}} = \sum_{\gamma}^{M - k^*} \frac{(M - k^*)\, \mathrm{Var}[H_\gamma]}{(1 - C^2)\epsilon^2} \tag{72}$$

One is free to choose a value of $C \in [0,1)$ to maximize
computational efficiency according to the particular constraints
of experiment and the distribution of operators
in the sum. It has been seen previously that using this
strategy in conjunction with locality information can potentially
reduce the costs of quantum chemistry calculations
dramatically.


2. Commuting Groups and Correlated Sampling
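
The term-truncation bookkeeping (ordering the terms by magnitude, forming the partial sums of Eq. (71), choosing $k^*$, and tightening the per-term variance target) can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name, the argument layout, and the assumption that single-shot variances are already known or estimated are all introduced here.

```python
def truncation_plan(weights, variances, eps, c):
    """Plan term truncation for target precision eps and constant 0 <= c < 1.

    weights[i]   : coefficient h_i of term i
    variances[i] : single-shot variance Var[H_i] (assumed known or estimated)
    Returns (kept_indices, per_term_threshold, expected_measurements).
    """
    if not 0 <= c < 1:
        raise ValueError("c must lie in [0, 1)")
    # Order terms from smallest to largest maximum contribution |h_i|.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    partial, k_star = 0.0, 0
    for k, i in enumerate(order, start=1):
        partial += abs(weights[i])          # partial sum e_k, Eq. (71)
        if partial < c * eps:
            k_star = k                      # largest k with e_k < C * eps
        else:
            break
    kept = order[k_star:]
    if not kept:
        return kept, 0.0, 0.0               # everything fell below the cutoff
    threshold = (1 - c**2) * eps**2 / len(kept)
    n_expect = sum(variances[i] / threshold for i in kept)   # Eq. (72)
    return kept, threshold, n_expect


weights = [1.0, -0.5, 0.004, 0.003, 0.2]
variances = [1.0] * len(weights)
kept, threshold, n_trunc = truncation_plan(weights, variances, eps=0.01, c=0.75)
_, _, n_full = truncation_plan(weights, variances, eps=0.01, c=0.0)
print(sorted(kept), n_trunc, n_full)
```

In this toy input the two smallest terms are dropped and the expected measurement count falls below the untruncated value. Whether truncation pays off in general depends on $C$ and on the distribution of weights, since the remaining terms must be measured to a tighter tolerance.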


Another strategy one may use besides truncation is to 
take advantage of commuting operators within the sum 
to reduce the number of state preparations required. If 
two operators $H_\alpha$ and $H_\beta$ commute, they may be measured
in sequence on the same state preparation without
biasing the final result of the expectation values. As the 
state preparation is expected to be more expensive than 
projective measurements, this has the potential to offer 
significant savings. However, the application of this tech¬ 
nique requires some care. 

While grouping terms into commuting sets cuts down 
on the number of state preparations required for a sin¬ 
gle pass at the measurements and does not bias the ex¬ 
pected outcome, there is some detail to consider in the 
statistics of measurement and estimation of uncertainty. 
As terms within a commuting set are measured on the
same state within each pass of the procedure, two operators
within a set may be correlated such that the estimators
of their average may have non-zero covariance,
i.e. $\mathrm{Cov}[\langle H_\alpha \rangle, \langle H_\beta \rangle] \neq 0$. This additional covariance can
either require more measurements for the set of terms if
the covariance is positive, or fewer if it is negative, in analogy
to the method of antithetic variables or correlated
sampling in classical Monte Carlo simulations [ 8 OI IM] . 
Thus one must be careful to group only operators that 
result in a practical efficiency gain. This concept is best 
illustrated with a short example. 

Consider the 2 spin Hamiltonian

$$H = -(X_1 X_2 + Y_1 Y_2) + Z_1 Z_2 + Z_1 + Z_2 \tag{73}$$

where $X$, $Y$, $Z$ are the standard Pauli operators and a
quantum state

$$|\psi\rangle = |01\rangle \tag{74}$$

which we will be measuring. The operators in this Hamiltonian
can be grouped in a number of ways into groups of
commuting terms. Consider the following three options

1. $\{-X_1 X_2\}, \{-Y_1 Y_2\}, \{Z_1 Z_2\}, \{Z_1\}, \{Z_2\}$

2. $\{-X_1 X_2\}, \{-Y_1 Y_2, Z_1 Z_2\}, \{Z_1, Z_2\}$

3. $\{-X_1 X_2, -Y_1 Y_2, Z_1 Z_2\}, \{Z_1, Z_2\}$.

Using the formulas from the previous section to compute
the expected number of state preparations for each
grouping of operators to a precision $\epsilon$, we may proceed
as follows. The expected estimator variance of the first
grouping is 2, but it requires 5 state preparations per
iteration (one for each of the 5 sets of commuting operators),
resulting in an expected number of state
preparations $n_{\mathrm{expect-1}} = 10/\epsilon^2$. In the second case, we
maintain the same variance, but group commuting operators
together that have 0 covariance, so the number
of preparations per iteration is reduced to 3 and we find
$n_{\mathrm{expect-2}} = 6/\epsilon^2$. The last case has the smallest number
of commuting groups, but introduces an extra covariance
term that results from covariance between $X_1 X_2$ and
$Y_1 Y_2$ on the state $|\psi\rangle$. As a result, the total number of
expected preparations is given by $n_{\mathrm{expect-3}} = 8/\epsilon^2$. Thus
while the last prescription had the fewest commuting
groups, the second was a better grouping, reducing
cost by almost a factor of 2 from the naive measurement
of all terms individually.
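
The three counts quoted above can be checked with a short calculation. The sketch below is an illustration written for this purpose, not code from the original work: it represents the state as a dictionary from bit tuples to amplitudes, treats each commuting group as a single Hermitian operator $Q$, and uses $\mathrm{Var}[Q] = \langle Q^2 \rangle - \langle Q \rangle^2$, which includes the covariances between group members. All function names are hypothetical.

```python
def apply_pauli_string(paulis, state):
    """Apply a Pauli string such as 'XX' or 'ZI' to a state {bits: amplitude}."""
    out = {}
    for bits, amp in state.items():
        new_bits = list(bits)
        for q, (op, b) in enumerate(zip(paulis, bits)):
            if op == "X":
                new_bits[q] = 1 - b
            elif op == "Y":            # Y|0> = i|1>, Y|1> = -i|0>
                new_bits[q] = 1 - b
                amp *= 1j if b == 0 else -1j
            elif op == "Z":            # Z|b> = (-1)^b |b>
                amp *= 1 if b == 0 else -1
        key = tuple(new_bits)
        out[key] = out.get(key, 0) + amp
    return out


def apply_group(group, state):
    """Apply Q = sum_k coeff_k * P_k, with group = [(coeff, pauli_string), ...]."""
    out = {}
    for coeff, paulis in group:
        for bits, amp in apply_pauli_string(paulis, state).items():
            out[bits] = out.get(bits, 0) + coeff * amp
    return out


def group_variance(group, state):
    """Var[Q] = <Q^2> - <Q>^2; for Hermitian Q, <Q^2> is the squared norm of Q|psi>."""
    q_state = apply_group(group, state)
    mean = sum(state.get(b, 0).conjugate() * a for b, a in q_state.items()).real
    second_moment = sum(abs(a) ** 2 for a in q_state.values())
    return second_moment - mean**2


def expected_preparations(grouping, state):
    """State preparations in units of 1/eps^2: (#groups) * (total variance)."""
    return len(grouping) * sum(group_variance(g, state) for g in grouping)


state = {(0, 1): 1.0}  # |01>
grouping_1 = [[(-1, "XX")], [(-1, "YY")], [(1, "ZZ")], [(1, "ZI")], [(1, "IZ")]]
grouping_2 = [[(-1, "XX")], [(-1, "YY"), (1, "ZZ")], [(1, "ZI"), (1, "IZ")]]
grouping_3 = [[(-1, "XX"), (-1, "YY"), (1, "ZZ")], [(1, "ZI"), (1, "IZ")]]
costs = [expected_preparations(g, state) for g in (grouping_1, grouping_2, grouping_3)]
print(costs)  # [10.0, 6.0, 8.0]
```

The third grouping is penalized because $X_1 X_2$ and $Y_1 Y_2$ have covariance $+1$ on $|01\rangle$, which raises the variance of that group from 2 to 4.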

This simple example illustrates how savings can be 
achieved through careful grouping, but also highlights 
the state and operator dependence of this strategy. The 
most crucial piece of information in deciding whether to 
group commuting terms is the covariance of different op¬ 
erators on the state. If one has a good approximation 
of the state, this can be estimated classically before an 
experiment to group operators that are expected to give 
cost savings. Alternatively, if one expects many points
in an optimization to be similar, the covariances can be estimated
once to low precision on the quantum device before beginning,
and these heuristic groupings can be used for the
remainder of the experiment. Again, we emphasize that
this strategy will not bias the final result, even if the sets 
chosen are non-optimal. It is merely a means of sampling 
cost reduction. 

Regardless of the strategy chosen, it is crucial to correctly
determine the statistical uncertainty of the final
estimate. One could estimate the covariances from the
measurements and account for this, but a perhaps conceptually
simpler approach, more true to the spirit of
the experiments, is to define new trivial estimators $\langle Q_i \rangle$,
which are constructed as follows. After a state preparation,
each operator in $Q_i$ is measured in turn in some
pre-defined order to give a sequence $\{x_j\}$. The sum of
these measurements for all the operators is defined to
be the new measurement $q_i = \sum_j x_j$. The estimator
for the average over many realizations is simply the
arithmetic mean of the measured values of $q_i$, which we
denote $\langle Q_i \rangle$. In this way the final
estimator may be constructed equivalently as

$$\langle H \rangle = \sum_i \langle Q_i \rangle \tag{75}$$

that clearly yields the same expectation value but is now
composed of estimators such that $\mathrm{Cov}[\langle Q_i \rangle, \langle Q_j \rangle] = 0$ for
$i \neq j$, allowing one to more conveniently estimate only the
variance of uncorrelated estimators to determine the uncertainty
in the final estimate and fix the desired tolerances
per term when measuring.
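
A minimal sketch of this construction is given below, assuming the raw data are stored as one list of already weighted outcomes $x_j$ per state preparation. The data layout and function names are assumptions made for illustration, and the uncertainty is taken from the sample variance of each group sum.

```python
from statistics import mean, variance


def group_sample_sums(shots):
    """Sum the outcomes recorded after each state preparation: q = sum_j x_j."""
    return [sum(outcomes) for outcomes in shots]


def estimate_energy(shots_by_group):
    """Combine the independent group estimators <Q_i> into <H>, Eq. (75).

    shots_by_group[i] is a list of shots for group i; each shot is the list of
    weighted outcomes x_j measured on one state preparation.
    Returns the estimate of <H> and the variance of that estimate.
    """
    energy, var_of_estimate = 0.0, 0.0
    for shots in shots_by_group:
        q = group_sample_sums(shots)
        energy += mean(q)
        var_of_estimate += variance(q) / len(q)  # variance of the sample mean
    return energy, var_of_estimate


# Toy data for three groups with four state preparations each.
shots_by_group = [
    [[-1.0], [1.0], [1.0], [-1.0]],
    [[-1.0, -1.0], [1.0, -1.0], [-1.0, -1.0], [1.0, -1.0]],
    [[1.0, -1.0], [1.0, -1.0], [1.0, -1.0], [1.0, -1.0]],
]
energy, var_of_estimate = estimate_energy(shots_by_group)
print(energy, var_of_estimate)
```

Because each group is sampled on its own state preparations, the group variances simply add, and no explicit covariance estimate between operators is needed.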

C. Beyond Energy to General Observables 

Finally we note that the method of calculating operator
averages outlined in this section often yields additional
information besides the originally intended expectation
value. For example, in the case of quantum chemistry,
the expectation values of the individual operators that compose
the Hamiltonian are the elements of the reduced 1 and 2 electron density
matrices, defined for a state $|\Psi\rangle$ as

$$D^{i}_{p} = \langle \Psi | a_i^\dagger a_p | \Psi \rangle \tag{76}$$

$$D^{ij}_{pq} = \langle \Psi | a_i^\dagger a_j^\dagger a_p a_q | \Psi \rangle \tag{77}$$

Knowledge of these reduced density matrices is sufficient
to determine not only the energy but the expectation
value of any one- and two-electron operators, such as the
dipole moment or charge density. This follows from the
fact that any one- and two-electron operators $F$ and $G$
may be written in a basis as

$$F = \sum_{ip} f_{ip}\, a_i^\dagger a_p \tag{78}$$

$$G = \sum_{ijpq} g_{ijpq}\, a_i^\dagger a_j^\dagger a_p a_q \tag{79}$$

where $f_{ip}$ and $g_{ijpq}$ are precomputed with the single particle
basis set. From this it is clear that the expectation
values are

$$\langle F \rangle = \sum_{ip} f_{ip} \langle \Psi | a_i^\dagger a_p | \Psi \rangle = \sum_{ip} f_{ip} D^{i}_{p} \tag{80}$$

$$\langle G \rangle = \sum_{ijpq} g_{ijpq} \langle \Psi | a_i^\dagger a_j^\dagger a_p a_q | \Psi \rangle = \sum_{ijpq} g_{ijpq} D^{ij}_{pq} \tag{81}$$

which may be computed trivially on a classical com¬ 
puter with the measured values from experiment. Thus 
the operator averaging methodology in this section gives 
access to a number of interesting observables of the 
quantum system with no additional required measure¬ 
ments, and this approach can be viewed alternatively as 
a form of scalable partial tomography. This point of view 
also suggests that a promising route for additional post¬ 
processing of data is to use techniques designed to en¬ 
force physical constraints on the estimated reduced den¬ 
sity matrices [5^ . 
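
The classical post-processing step amounts to two tensor contractions. The sketch below assumes the measured reduced density matrices are stored as NumPy arrays with index order `rdm1[i, p]` and `rdm2[i, j, p, q]` matching Eqs. (76)-(77); that storage convention, the function names, and the toy coefficients are assumptions made here for illustration.

```python
import numpy as np


def one_body_expectation(f, rdm1):
    """<F> = sum_{ip} f_{ip} D^i_p, as in Eq. (80)."""
    return np.einsum("ip,ip->", f, rdm1)


def two_body_expectation(g, rdm2):
    """<G> = sum_{ijpq} g_{ijpq} D^{ij}_{pq}, as in Eq. (81)."""
    return np.einsum("ijpq,ijpq->", g, rdm2)


# Toy example: two spin orbitals, one electron occupying orbital 0.
rdm1 = np.diag([1.0, 0.0])
rdm2 = np.zeros((2, 2, 2, 2))        # no pairs of electrons, so the 2-RDM vanishes

number_op = np.eye(2)                # coefficients f_{ip} of the particle-number operator
toy_one_body = np.array([[0.3, 0.1],
                         [0.1, -0.2]])  # hypothetical one-electron coefficients
toy_two_body = np.ones((2, 2, 2, 2))    # hypothetical two-electron coefficients

n_electrons = one_body_expectation(number_op, rdm1)
f_value = one_body_expectation(toy_one_body, rdm1)
g_value = two_body_expectation(toy_two_body, rdm2)
print(n_electrons, f_value, g_value)
```

Any further one- or two-electron observable can be evaluated from the same measured arrays by swapping in a different coefficient tensor, with no additional measurements.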

V. OPTIMIZATION OF $\theta$

The final piece of the variational quantum eigensolver
is a method for updating the parameters $\theta$ based on the
measured value of the objective function of interest. The
dependence of the objective function on the parameters
will, of course, depend upon the ansatz being used and
will in general be non-linear and non-convex. This is not
to say that an ansatz satisfying desirable criteria such as convexity
could not be designed, but rather that in general it
may not be. As such, one may not expect global optimization
or verification of a proposed solution to be
feasible. However, in many cases local optima are sufficient
and prior knowledge of a problem offers high quality
starting points for the optimization. This has often
been the case in quantum chemistry, where non-linear
procedures such as Hartree-Fock utilize very good local
optima and benefit greatly from high quality starting
guesses. The use of high quality starting guesses will
likely be important for all types of ansatz discussed here
as well. In the case of UCC for example, perturbation
theory methods such as MP2 could be used to generate
starting guesses.

FIG. 5. The accuracy of the final energy of the optimized
wavefunction at convergence compared to the known exact
solution, as a function of the precision ($\epsilon$) in the function value
in the optimizer for different methods. The values are averaged
over 20 repetitions and the error bars indicate 1 standard
deviation of the measured data. The TOMLAB methods provide
dramatically superior performance at essentially all levels
of measurement precision above $\epsilon = 0.1$.

FIG. 6. The number of function evaluations required to reach
convergence for minimization of the wave function as a function
of the precision in the function value. The accuracy of
each of these minimizations relative to the exact answer is
shown in Fig. 5. The TOMLAB methods are seen to be dramatically
more efficient than the Nelder-Mead method, sometimes
requiring 3 orders of magnitude fewer function evaluations
to achieve higher accuracy in the final answer for higher
desired precisions.

The field of non-linear optimization is well developed,
with many tools, both general methods and methods specialized
to particular optimization problems [83]. The objective
function by design here is statistical in nature,
making it difficult to directly use many of the basic 
tools from numerical optimization that rely on gradi¬ 
ents. In the original implementation, the derivative 
free Nelder-Mead simplex method was used as it has 
reasonable robustness to small quantities of noise, at 
least in comparison to methods such as standard gra¬ 
dient descent. However, with developments in the op¬ 
timization of functions, it is clear that there are more 
efficient options available for this problem and in this 


work we compare the Nelder-Mead simplex method, 
TOMLAB/GLCLUSTER, TOMLAB/LGO, and TOM¬ 
LAB /MULTIMIN methods (HI [SS] for an example prob¬ 
lem. These particular algorithms were chosen because 
of Nelder-Mead’s use in the original work, and the supe¬ 
rior performance of the TOMLAB algorithms in a recent 
comprehensive benchmark of derivative free optimization 
techniques [84]. Each of the TOMLAB algorithms uses a
different derivative free search strategy and includes both
global and local considerations in the choice of new iterates.
Details of the TOMLAB algorithms can be found
in the user’s guide [85] . 

The example problem we benchmark in this case is the
optimization of a unitary coupled cluster wavefunction
for H$_2$ in a minimal STO-3G basis. In these benchmarks,
simulated measurement estimator noise is added to the
objective function at a specified variance $\epsilon^2$. The optimization
is then repeated 20 times at a given $\epsilon$ and the
resulting accuracy with respect to the exact solution is
plotted in Fig. 5 as a function of the measurement noise,
which can be controlled through the number of measurements
taken in the experiment. The error bars indicate
1 standard deviation in the distribution of values measured
over the 20 repetitions. Additionally, the number
of evaluations of the expectation value of the energy required
to reach convergence is plotted in Fig. 6 as a function of
the same precision $\epsilon$. It is seen in these plots that in all
instances, the TOMLAB methods not only converge to a
higher accuracy in the energy, but sometimes do so with as many
as 1000 times fewer function evaluations than the Nelder-Mead
method which was previously coupled to the variational
hybrid quantum-classical approach. Moreover, the
approximately constant number of function evaluations
required to reach convergence as a function of precision
suggests that more savings may be reached by using a
variable precision optimization, as the cost of a function
evaluation to a precision $\epsilon$ scales roughly as $1/\epsilon^2$ in this
case.
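
To make the cost argument concrete, the sketch below models two ingredients mentioned above: an objective with added Gaussian noise of variance $\epsilon^2$, and a sampling cost of $1/\epsilon^2$ per function evaluation. The geometric precision schedule is a hypothetical choice introduced here, and the sketch only counts cost; it does not establish that an optimizer converges equally well under a variable schedule.

```python
import random


def noisy_objective(f, eps, rng):
    """Wrap f so each call returns f(theta) plus Gaussian noise of variance eps**2."""
    def wrapped(theta):
        return f(theta) + rng.gauss(0.0, eps)
    return wrapped


def sampling_cost(eps_schedule):
    """Total cost in units where one evaluation at precision eps costs 1/eps**2."""
    return sum(1.0 / eps**2 for eps in eps_schedule)


def geometric_schedule(eps_start, eps_final, n_evals):
    """Precisions tightening geometrically from eps_start to eps_final."""
    ratio = (eps_final / eps_start) ** (1.0 / (n_evals - 1))
    return [eps_start * ratio**k for k in range(n_evals)]


# Same number of function evaluations, two precision strategies.
n_evals = 50
fixed = [1e-3] * n_evals
variable = geometric_schedule(1e-1, 1e-3, n_evals)
cost_fixed = sampling_cost(fixed)
cost_variable = sampling_cost(variable)
print(cost_fixed, cost_variable)

# A simulated noisy evaluation of a toy one-parameter objective.
rng = random.Random(1)
energy = noisy_objective(lambda theta: (theta - 0.3) ** 2, eps=1e-2, rng=rng)
sample = energy(0.3)  # exact value 0 plus noise of standard deviation 1e-2
```

With a roughly constant evaluation count, most of the fixed-precision cost is spent on early iterates that do not need tight error bars, which is the saving a variable precision strategy would target.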

While the performance of the TOMLAB algorithms is 
impressive relative to previous standards, these methods 
that utilize some global optimization and random search 
strategies will require further numerical testing as the 
dimension of the problem space grows. Moreover, none of 
these methods were specifically designed for a stochastic 
objective function. This is an area of great importance 
in the algorithm as a whole, and all improvements can 
translate to dramatic savings in the overall runtime. As 
a result this is a topic of ongoing research. 


VI. CONCLUSIONS 

Quantum computers promise to change the way we 
think about problems across a plethora of different fields, 
including the important areas of optimization and eigen¬ 
value problems. While the construction of full scale, 
error corrected quantum devices still poses many tech¬ 
nical challenges, great progress is being made in their 
development. In the era of pre-threshold devices, and 
indeed beyond it, quantum devices may find an advan¬ 
tage in leveraging classical resources alongside quantum 
resources to exploit the powerful technologies already in 
existence today. The variational quantum eigensolver is 
an algorithm designed to exploit these resources in both 
a pre- and post-threshold world, and it has been specu¬ 
lated that variational algorithms of this type may be the 
first to demonstrate a quantum advantage over classical 
supercomputers for practical problems [86]. 

In this work, we explored the theory of a variational 
hybrid quantum-classical approach beyond its original 
context to more general problems. We explored two po¬ 
tential candidates for an ansatz that may allow one to 
go beyond classical computation, namely a variational 
adiabatic formulation and the unitary coupled cluster 
method. A simple connection between the second order 
unitary coupled cluster method and universal gate mod¬ 
els of quantum computation was demonstrated. More¬ 
over, we showed that the variational formalism allows 
for a natural form of error suppression for some quan¬ 
tum problems in a pre-threshold device. From a practical 
computational side, we showed that careful grouping of 
terms and truncation can offer significant cost savings in 
the use of this algorithm. Finally we improved the classi¬ 
cal subparts of the algorithm and found that advances in 
derivative free optimization offer dramatic cost savings 
over previous implementations. 

Only time will tell if variational algorithms will be the 
first to surpass classical computers and if they can accomplish
that feat on a pre-threshold device. Regardless
of this outcome, the variational framework offers a pow¬ 
erful perspective for the development of tools through¬ 
out quantum computation and the perspectives we have 
investigated and extended in this work will aid in this 
endeavor.

VII. APPENDIX

A. Fermion commutation relations 

Here we document the generic commutation relations 
from the interacting fermion Hamiltonian without assum¬ 
ing anything about whether an index corresponds to an 
occupied or unoccupied orbital in a reference state. 

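$$[a_i^\dagger a_j,\; a_p^\dagger a_q^\dagger a_r a_s] = a_i^\dagger a_q^\dagger a_r a_s \delta_{pj} - a_i^\dagger a_p^\dagger a_r a_s \delta_{qj} - a_p^\dagger a_q^\dagger a_r a_j \delta_{si} + a_p^\dagger a_q^\dagger a_s a_j \delta_{ri} \tag{83}$$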

~ ^l(^j^r^s{,^kq^lp ^kp^lq) (^ 4 ) 

j^q^kk^rkks^lp dpClgCl^Qyrdf^CliSs{ 
“t“ dj^CtjQj^dlCLj-dsSkp dpdgd^ dgdfcdlSri 
“t“ d^djdpdfi^drdsSlq d^pd)qdj^dj-d}^dl6sj 

^2 ~\~ dpd^d^dsdf^dlSj-j 


B. Eigenvector Bound 

In this section we derive the bound on the quality of the
eigenvector stated in the text as determined by the variance
of the operator. The ground state differs from
general eigenstates in allowing a slightly easier derivation,
so we split the derivations into two separate subsections.

1. Ground State

Beginning with a calculation of the average energy in
terms of the eigenvalues and weights of eigenvectors in
a state $|\Phi\rangle$ decomposed into eigenvectors of $H$ as
$|\Phi\rangle = \sum_i c_i |\chi_i\rangle$,

$$\begin{aligned}
\langle H \rangle &= |c_1|^2 \lambda_1 + \sum_{j>1} |c_j|^2 \lambda_j \\
&\ge |c_1|^2 \lambda_1 + \sum_{j>1} |c_j|^2 (\lambda_1 + \Delta) \\
&= |c_1|^2 \lambda_1 + (1 - |c_1|^2)(\lambda_1 + \Delta) \\
&= \lambda_1 + \Delta - |c_1|^2 \Delta \\
&\ge \left( \langle H \rangle - \sqrt{\mathrm{Var}(\theta)} \right) + \Delta - |c_1|^2 \Delta
\end{aligned} \tag{85}$$

where $\Delta$ is a lower bound on the gap between the ground
and first excited eigenvalue. Rearranging yields the desired
bound on the overlap with the ground state

$$|c_1|^2 \ge 1 - \frac{\sqrt{\mathrm{Var}(\theta)}}{\Delta} \tag{86}$$

where the promise that the error is less than the gap,
i.e. $\sqrt{\mathrm{Var}(\theta)} < \Delta$, guarantees a positive bound, and the
overlap estimate converges to 1 as $\mathrm{Var}(\theta)$ is reduced to 0.


2. General States

Starting with an expression for the variance of $H$ over
a state $|\Psi\rangle = \sum_i c_i |\chi_i\rangle$ where $|\chi_i\rangle$ are eigenvectors of $H$
with eigenvalue $\lambda_i$, we have

$$\mathrm{Var}[H] = \langle \Psi | (H - E)^2 | \Psi \rangle = \sum_{i \neq k} (\lambda_i - E)^2 |c_i|^2 + (\lambda_k - E)^2 |c_k|^2 \tag{87}$$

where $E = \langle H \rangle$. Our goal is to bound the value of $|c_k|^2$
based on a measured variance of the state with respect
to $H$, $\mathrm{Var}[H]$, and a known bound on the gap $\Delta$. Let
$\alpha = (\lambda_k - E)^2$; from here we see that

$$\mathrm{Var}[H] \ge \left( \alpha + \sqrt{\mathrm{Var}[H]} \right)^2 (1 - |c_k|^2) + \alpha |c_k|^2 \tag{88}$$

Rearranging to have an expression for $|c_k|^2$ and letting
$\gamma = \left( \alpha + \sqrt{\mathrm{Var}[H]} \right)^2$, we have

$$|c_k|^2 \ge \frac{\gamma - \mathrm{Var}[H]}{\gamma - \alpha} \tag{89}$$

Following our assumptions on the gap and errors, we
know that $0 < \alpha < \mathrm{Var}[H] < \gamma$, from which it
follows that

$$|c_k|^2 \ge \frac{\gamma - \mathrm{Var}[H]}{\gamma} \tag{90}$$
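
As a numerical illustration of the ground-state case, rearranging Eq. (85) gives $|c_1|^2 \ge 1 - \sqrt{\mathrm{Var}}/\Delta$, which can be checked on a two-level toy system. The function name, the clamping of the bound at zero, and the toy spectrum below are assumptions introduced for this sketch.

```python
import math


def ground_state_overlap_bound(variance, gap):
    """Lower bound on |c_1|^2 given the measured variance and a gap lower bound.

    The bound is informative only when sqrt(variance) < gap; otherwise it is
    clamped to zero here.
    """
    if variance < 0 or gap <= 0:
        raise ValueError("variance must be non-negative and gap positive")
    return max(0.0, 1.0 - math.sqrt(variance) / gap)


# Toy two-level system: eigenvalues 0 and 1 (gap 1), ground-state weight 0.99.
weight_ground = 0.99
eigenvalues = (0.0, 1.0)
mean_energy = weight_ground * eigenvalues[0] + (1 - weight_ground) * eigenvalues[1]
mean_square = weight_ground * eigenvalues[0] ** 2 + (1 - weight_ground) * eigenvalues[1] ** 2
variance = mean_square - mean_energy**2

bound = ground_state_overlap_bound(variance, gap=1.0)
print(bound, weight_ground)
```

In this example the bound evaluates to roughly 0.90 against a true overlap of 0.99, so it is valid but not tight, and it tightens as the variance is driven toward zero.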